Method and apparatus for determining solver, and computing device
Generating the key decision algorithms of a mixed integer linear programming (MILP) solver with a large language model removes the reliance on expert experience, improves solution efficiency and accuracy, and allows fuller exploration of the algorithm space and optimization of the target algorithm.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2025-12-16
- Publication Date
- 2026-07-02
AI Technical Summary
In the existing technology, the key decision algorithm design of mixed integer linear programming (MILP) problem solvers relies on expert experience, resulting in high time and human resource costs. Furthermore, expert bias limits the exploration of the algorithm space, affecting the solution efficiency and accuracy.
By leveraging a large language model to generate key decision-making algorithms in the solver, and by acquiring evaluation metrics and template information of historical algorithms, combined with population evolution mechanisms, optimized algorithms are selected, avoiding reliance on expert experience and improving the efficiency of algorithm space exploration.
It saves time and manpower costs, improves the solver's efficiency and accuracy, fully explores the algorithm space, and generates target algorithms that are more suitable for user problems.
Description
Methods, apparatus, and computing equipment for determining the solver
[0001] This application claims priority to Chinese Patent Application No. 202411979116.4, filed with the China National Intellectual Property Administration on December 27, 2024, entitled “Method, Apparatus and Computing Device for Determining a Solver”, the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This application relates to the field of mathematical programming, and more specifically, to a method, apparatus, and computing device for determining a solver.
Background Technology
[0003] Operations research is an interdisciplinary field that emerged in the 1930s and 1940s. It primarily studies how resources are used and planned, aiming to make the most effective use of limited resources and achieve overall optimization under given constraints. Mathematical programming is an important branch of operations research; it mainly studies how to find the optimal solution that minimizes or maximizes a given function within a given region.
[0004] Solving mixed-integer linear programming (MILP) problems is the core of mathematical programming solvers. The essence of solving MILP problems is to obtain the optimal solution by solving a series of linear programming problems. Each core step in MILP solving involves a key decision algorithm, and the selection and design of this algorithm significantly impacts the overall efficiency of the solver.
[0005] In one related technical solution, key decision-making algorithms involved in each step of the solver are designed using expert experience. These algorithms are then embedded into the mathematical programming solver to solve specific examples. Based on the performance on the dataset, the key decision-making algorithms embedded in the solver are adjusted using expert experience and then embedded back into the mathematical programming solver to continue the solution process.
[0006] In the aforementioned technical solutions, the quality of the key decision algorithm embedded in the mathematical programming solver depends entirely on expert experience. On the one hand, designing entirely new key decision algorithms based on expert experience or adjusting existing key decision algorithms based on solution results requires significant time and manpower. On the other hand, experts designing and adjusting key decision algorithms in the solver based on their past experience inevitably introduces their personal prior biases, which to some extent limits the explorable algorithm space.
[0007] Therefore, improving the solver's efficiency and accuracy has become an urgent technical problem to be solved.
Summary of the Invention
[0008] This application provides a method, apparatus, and computing device for determining a solver, which can improve the solving efficiency and accuracy of the solver.
[0009] In a first aspect, a method for determining a solver is provided, comprising: obtaining context information of a prompt of a large language model, the context information including evaluation metrics for at least one historical algorithm; outputting at least one new algorithm using the large language model based on the prompt; determining the evaluation metrics corresponding to each of the at least one new algorithm; and selecting an algorithm set from the at least one historical algorithm and the at least one new algorithm according to the evaluation metrics corresponding to each of the at least one new algorithm and the evaluation metrics corresponding to each of the at least one historical algorithm, wherein the algorithm set includes at least one target algorithm, the at least one target algorithm being used by the solver to solve a received problem to be solved, so as to obtain a solution to the problem to be solved.
[0010] The above technical solution avoids the need to design algorithms in the solver based on expert experience, which not only saves time and manpower costs, but also allows for full exploration of the overall algorithm space using a large language model, thereby improving the accuracy of the solver.
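The method of the first aspect can be read as a generate, evaluate, and select loop. The following is a minimal sketch under stated assumptions, not the implementation described in this application: algorithms are represented as plain strings, `fake_llm` is a stand-in for a real large language model call, `evaluate` is a hypothetical scoring function for which lower values are better, and the `keep` parameter is illustrative.

```python
from typing import Callable, Dict, List

# Hypothetical representation: an algorithm is a string, a metric is a float
# (lower is assumed to be better).
Algorithm = str
Metrics = Dict[Algorithm, float]


def determine_algorithm_set(
    historical: Metrics,
    llm: Callable[[str], List[Algorithm]],
    evaluate: Callable[[Algorithm], float],
    keep: int,
) -> List[Algorithm]:
    """One iteration: build context, generate, evaluate, select."""
    # Step 1: the context carries the evaluation metrics of the historical algorithms.
    context = "\n".join(f"{alg} -> metric {m:.3f}" for alg, m in historical.items())
    prompt = "Propose new algorithms that improve on these:\n" + context

    # Step 2: the language model outputs at least one new algorithm.
    new_algorithms = llm(prompt)

    # Step 3: determine the evaluation metric of each new algorithm.
    new_metrics = {alg: evaluate(alg) for alg in new_algorithms}

    # Step 4: select the algorithm set from the historical and new algorithms.
    pool = {**historical, **new_metrics}
    return sorted(pool, key=pool.get)[:keep]


def fake_llm(prompt: str) -> List[Algorithm]:
    # Stand-in for a real model call; always returns the same two candidates.
    return ["rule_c", "rule_d"]


scores = {"rule_c": 0.4, "rule_d": 2.0}  # made-up metrics for the candidates
selected = determine_algorithm_set(
    historical={"rule_a": 1.0, "rule_b": 0.7},
    llm=fake_llm,
    evaluate=lambda alg: scores[alg],
    keep=2,
)
print(selected)
```

In this toy run the returned set mixes one new and one historical algorithm, which illustrates that selection is made over both groups together.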
[0011] In conjunction with the first aspect, in some implementations of the first aspect, the context includes the at least one historical algorithm and its respective evaluation metric.
[0012] In the above technical solution, at least one historical algorithm can be directly carried in the context, which can improve the efficiency of the large language model in outputting at least one new algorithm.
[0013] In conjunction with the first aspect, in some implementations of the first aspect, at least one first historical problem is received; the at least one first historical problem is solved based on the at least one new algorithm to obtain at least one first solution corresponding to the at least one first historical problem, and the evaluation index of the at least one new algorithm is determined.
[0014] In the above technical solution, the evaluation index of each of the at least one new algorithm is obtained using the at least one first historical problem. In this way, when a problem to be solved is subsequently received that belongs to the same domain or follows the same distribution as the at least one first historical problem, the evaluation index has greater reference value for solving that problem.
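One way to picture how an evaluation index is obtained from historical problems is sketched below. It is an illustration only: a toy knapsack heuristic stands in for the solver, the candidate "algorithm" is an item-ordering rule passed as `key`, and the metric (mean relative gap to a best known value, lower is better) is an assumption, since the text here does not fix a particular metric. All names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Tuple

Item = Tuple[int, int]  # (value, weight)


@dataclass
class HistoricalProblem:
    items: List[Item]
    capacity: int
    best_known: int  # best known objective value, used to compute a gap


def greedy_solve(problem: HistoricalProblem, key: Callable[[Item], float]) -> int:
    """Toy 'solver': pack items in descending order of the candidate rule `key`."""
    total_value, remaining = 0, problem.capacity
    for value, weight in sorted(problem.items, key=key, reverse=True):
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def evaluation_metric(problems: List[HistoricalProblem], key: Callable[[Item], float]) -> float:
    """Mean relative gap to the best known value over all historical problems."""
    gaps = [(p.best_known - greedy_solve(p, key)) / p.best_known for p in problems]
    return mean(gaps)


problems = [
    HistoricalProblem(items=[(60, 10), (100, 20), (120, 30)], capacity=50, best_known=220),
    HistoricalProblem(items=[(10, 5), (40, 4), (30, 6), (50, 3)], capacity=10, best_known=90),
]

# Two candidate rules evaluated on the same historical problems.
by_value = evaluation_metric(problems, key=lambda item: item[0])
by_density = evaluation_metric(problems, key=lambda item: item[0] / item[1])
print(f"by value: {by_value:.3f}, by density: {by_density:.3f}")
```

Because both rules are scored on the same problem set, their metrics are directly comparable, which is what makes them useful for later selection.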
[0015] In conjunction with the first aspect, in some implementations of the first aspect, before obtaining the context information of the prompt of the large language model, the method further includes: receiving at least one second historical problem; solving the at least one second historical problem based on the at least one historical algorithm respectively, obtaining at least one second solution corresponding to each of the at least one second historical problem, and determining the evaluation index of each of the at least one historical algorithm.
[0016] In the above technical solution, the evaluation index of each of the at least one historical algorithm is determined before the context is obtained. This avoids recalculating the evaluation index of each historical algorithm during the current iteration, thereby improving the efficiency of obtaining the at least one target algorithm.
[0017] In conjunction with the first aspect, in some implementations of the first aspect, the algorithm set is selected from the at least one historical algorithm and the at least one new algorithm using a population evolution mechanism, according to the evaluation index corresponding to each of the at least one new algorithm and the evaluation index corresponding to each of the at least one historical algorithm.
[0018] In the above technical solution, based on the population evolution mechanism and following the principle of survival of the fittest, algorithms with poor evaluation indicators among at least one historical algorithm and at least one new algorithm can be eliminated according to the evaluation indicators corresponding to at least one new algorithm and at least one historical algorithm, while algorithms with better evaluation indicators are retained. This helps to discover the target algorithm from at least one historical algorithm and at least one new algorithm. The target algorithm is the recommended algorithm that may be more suitable for solving the user's problem.
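The survival-of-the-fittest step can be sketched with truncation selection, one common form of evolutionary selection. The application does not state which selection operator is used, so this choice, the assumption that a lower evaluation index is better, and all names below are illustrative.

```python
from typing import Dict, List, Tuple


def evolve_population(
    historical: Dict[str, float],
    new: Dict[str, float],
    population_size: int,
) -> List[Tuple[str, float]]:
    """Merge both generations and keep the `population_size` best-scoring algorithms."""
    merged = {**historical, **new}  # if a name appears twice, the newer metric wins
    ranked = sorted(merged.items(), key=lambda pair: pair[1])  # lower is better (assumed)
    return ranked[:population_size]


survivors = evolve_population(
    historical={"hist_1": 0.30, "hist_2": 0.12},
    new={"new_1": 0.08, "new_2": 0.45},
    population_size=3,
)
# The best survivor plays the role of the recommended target algorithm.
target_algorithm = survivors[0][0]
print(survivors, target_algorithm)
```

Here the worst-scoring candidate is eliminated and the remaining three carry over to the next generation.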
[0019] In conjunction with the first aspect, in some implementations of the first aspect, the context also includes template information of the at least one historical algorithm, which includes at least one of the following: input information, output information, and decision content of the historical algorithm.
[0020] In the above technical solution, the template information of historical algorithms can be put into the context information input of the large language model, so that the large language model can output the relevant information of the new algorithm based on the template information of historical algorithms.
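As an illustration of what template information might look like when placed in the context, the sketch below models the three items named above (input information, output information, decision content) as a small data class. The field names, the text layout, and the example values are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmTemplate:
    """Template information of a historical algorithm (field names are illustrative)."""

    input_info: str        # what the algorithm receives from the solver
    output_info: str       # what it must return to the solver
    decision_content: str  # which decision it makes

    def to_context(self) -> str:
        """Render the template as text that can be placed in the model's context."""
        return (
            f"Input: {self.input_info}\n"
            f"Output: {self.output_info}\n"
            f"Decision: {self.decision_content}"
        )


template = AlgorithmTemplate(
    input_info="LP relaxation values of the integer variables",
    output_info="index of the variable to branch on",
    decision_content="branching variable selection",
)
context_text = template.to_context()
print(context_text)
```

Fixing the input and output in the template is what lets a newly generated algorithm be dropped into the same slot in the solver as the historical one.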
[0021] In conjunction with the first aspect, in some implementations of the first aspect, the prompt is used to instruct the large language model to use the evaluation metrics of each of the at least one historical algorithm as supervision information to generate the at least one new algorithm; the context information and the prompt information are input into the large language model, and the large language model outputs the at least one new algorithm.
[0022] In the above technical solution, template information of historical algorithms is incorporated into the context information input of the large language model, enabling the large language model to output relevant information of the new algorithm based on the template. This facilitates the integration of the new algorithm into the solver to solve the problem. The evaluation metrics of at least one historical algorithm are used as supervision information to increase the probability that the large language model will generate at least one high-quality new algorithm.
[0023] In conjunction with the first aspect, in some implementations of the first aspect, the at least one historical algorithm and the at least one new algorithm have the same function.
[0024] In conjunction with the first aspect, in some implementations of the first aspect, the method further includes: receiving configuration data input by the user, the configuration data including configuration parameters of the large language model.
[0025] In the above technical solution, users can input configuration data, which makes the generated target algorithm better meet the user's business needs.
[0026] In a second aspect, an apparatus for determining a solver is provided, comprising: an acquisition module, a generation module, a determination module, and a filtering module. The acquisition module acquires context information of a prompt of a large language model, the context information including evaluation metrics for at least one historical algorithm. The generation module uses the large language model based on the prompt to output at least one new algorithm. The determination module determines the evaluation metrics corresponding to each of the at least one new algorithm. The filtering module selects an algorithm set from the at least one historical algorithm and the at least one new algorithm based on the evaluation metrics corresponding to the at least one new algorithm and the evaluation metrics corresponding to the at least one historical algorithm. The algorithm set includes at least one target algorithm, which is used by the solver to solve a received problem to obtain a solution to the problem.
[0027] In conjunction with the second aspect, in some implementations of the second aspect, the context includes the at least one historical algorithm and its respective evaluation metric.
[0028] In conjunction with the second aspect, in some implementations of the second aspect, the determination module is specifically used to: receive at least one first historical problem; solve the at least one first historical problem based on the at least one new algorithm respectively, obtain at least one first solution corresponding to the at least one first historical problem, and determine the evaluation index of each of the at least one new algorithm.
[0029] In conjunction with the second aspect, in some implementations of the second aspect, the acquisition module is further configured to receive at least one second historical problem; the acquisition module is further configured to solve the at least one second historical problem based on the at least one historical algorithm respectively, to obtain at least one second solution corresponding to each of the at least one second historical problem, and to determine the evaluation index of each of the at least one historical algorithm.
[0030] In conjunction with the second aspect, in some implementations of the second aspect, the filtering module is specifically used to: select the algorithm set from the at least one historical algorithm and the at least one new algorithm using a population evolution mechanism, according to the evaluation index corresponding to each of the at least one new algorithm and the evaluation index corresponding to each of the at least one historical algorithm.
[0031] In conjunction with the second aspect, in some implementations of the second aspect, the context also includes template information of the at least one historical algorithm, which includes at least one of the following: input information, output information, and decision content of the historical algorithm.
[0032] In conjunction with the second aspect, in some implementations of the second aspect, the prompt is used to instruct the large language model to use the evaluation metrics of each of the at least one historical algorithm as supervision information to generate the at least one new algorithm; the generation module is specifically used to: input the context information and the prompt information into the large language model, and the large language model outputs the at least one new algorithm.
[0033] In conjunction with the second aspect, in some implementations of the second aspect, the at least one historical algorithm and the at least one new algorithm have the same function.
[0034] In conjunction with the second aspect, in some implementations of the second aspect, the acquisition module is also used to receive configuration data input by the user, which includes configuration parameters of the large language model.
[0035] It should be understood that for the beneficial effects of the second aspect and its various implementations, please refer to the first aspect and its various implementations; they will not be repeated here.
[0036] In a third aspect, a computing device is provided, including a processor and a memory, and optionally, an input / output interface. The processor controls the input / output interface to send and receive information, the memory stores a computer program, and the processor retrieves and runs the computer program from the memory, causing the computing device to execute the method of the first aspect or any possible implementation thereof.
[0037] Optionally, the processor can be a general-purpose processor, which can be implemented in hardware or software. When implemented in hardware, the processor can be a logic circuit, integrated circuit, etc.; when implemented in software, the processor can be a general-purpose processor that reads software code stored in memory. This memory can be integrated into the processor or located outside the processor and exist independently.
[0038] In a fourth aspect, a computing device cluster is provided, including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, such that the computing device cluster performs the method of the first aspect or any possible implementation thereof.
[0039] In a fifth aspect, a chip is provided that acquires and executes instructions to implement the methods described in the first aspect and any implementation thereof.
[0040] Optionally, as one implementation, the chip includes a processor and a data interface, through which the processor reads instructions stored in the memory and executes the methods in the first aspect and any implementation thereof.
[0041] Optionally, as one implementation, the chip may further include a memory storing instructions, and the processor is used to execute the instructions stored in the memory. When the instructions are executed, the processor is used to perform the method in the first aspect and any implementation thereof.
[0042] In a sixth aspect, a computer program product containing instructions is provided, which, when executed by a computing device, cause the computing device to perform the methods described in the first aspect and any implementation thereof.
[0043] In a seventh aspect, a computer program product containing instructions is provided, which, when run by a cluster of computing devices, cause the cluster of computing devices to perform the methods described in the first aspect and any implementation thereof.
[0044] In an eighth aspect, a computer-readable storage medium is provided, including computer program instructions that, when executed by a computing device, cause the computing device to perform the method as described in the first aspect and any implementation thereof.
[0045] As examples, such computer-readable storage media include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drive.
[0046] Optionally, as one implementation, the aforementioned storage medium can specifically be a non-volatile storage medium.
[0047] A ninth aspect provides a computer-readable storage medium including computer program instructions that, when executed by a cluster of computing devices, perform the method as described in the first aspect and any implementation thereof.
[0048] As examples, such computer-readable storage media include, but are not limited to, one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), flash memory, electrically erasable PROM (EEPROM), and hard drive.
[0049] Optionally, as one implementation, the aforementioned storage medium can specifically be a non-volatile storage medium.
Attached Figure Description
[0050] Figure 1 is a schematic block diagram of a cloud scenario applicable to an embodiment of this application.
[0051] Figure 2 is a schematic flowchart of a method for determining a solver provided in an embodiment of this application.
[0052] Figure 3 is a schematic block diagram of a system architecture provided in an embodiment of this application.
[0053] Figure 4 is a schematic block diagram of a solver determination device 400 provided in an embodiment of this application.
[0054] Figure 5 is a schematic block diagram of a deployment of the apparatus 400 for determining a solver provided in an embodiment of this application.
[0055] Figure 6 is a schematic diagram of the architecture of a computing device 1500 provided in an embodiment of this application.
[0056] Figure 7 is a schematic diagram of the architecture of a computing device cluster provided in an embodiment of this application.
[0057] Figure 8 is a schematic diagram of the connection between computing devices 1500A and 1500B via a network provided in an embodiment of this application.
Detailed Implementation
[0058] The technical solutions in this application will now be described with reference to the accompanying drawings.
[0059] This application will present various aspects, embodiments, or features relating to systems comprising multiple devices, components, modules, etc. It should be understood and appreciated that individual systems may include additional devices, components, modules, etc., and / or may not include all devices, components, modules, etc. discussed in conjunction with the accompanying drawings. Furthermore, combinations of these approaches are also possible.
[0060] Furthermore, in the embodiments of this application, the words "exemplary," "for example," etc., are used to indicate that they are examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" in this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of the term "exemplary" is intended to present the concept in a concrete manner.
[0061] In the embodiments of this application, "relevant" and "corresponding" can sometimes be used interchangeably. It should be noted that when the distinction is not emphasized, their intended meanings are consistent.
[0062] The business scenarios described in the embodiments of this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided in the embodiments of this application. As those skilled in the art will know, with the evolution of network architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
[0063] References to "one embodiment" or "some embodiments" as described in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless otherwise specifically emphasized.
[0064] In this application, "at least one" means one or more, and "more than one" means two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can mean: A alone, A and B simultaneously, and B alone, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.
[0065] To facilitate understanding, the relevant terms and concepts that may be involved in the embodiments of this application will be introduced below.
[0066] 1. Operations Research
[0067] Operations research is an interdisciplinary field that emerged in the 1930s and 1940s. It primarily studies how resources are used and planned, aiming to maximize the benefits of limited resources and achieve overall optimization under given constraints. For example, operations research mainly uses mathematical methods to study optimization paths and schemes for various systems, providing decision-makers with a basis for scientific decision-making.
[0068] 2. Mathematical Programming
[0069] Mathematical programming is an important branch of operations research. It mainly studies how to find the optimal solution that minimizes or maximizes a certain function in a given region.
[0070] Mathematical programming has extremely wide applications. Depending on the nature of the problem and the methods used, it can be divided into many different branches, such as linear programming, integer programming, nonlinear programming, combinatorial optimization, multi-objective programming, stochastic programming, dynamic programming, and parametric programming. These problems have numerous applications in many fields, including supply chain management, finance, transportation, and communications.
[0071] 3. Mathematical Programming Solver
[0072] It can also be simply called a solver. This solver is mainly used to solve mathematical programming problems. The industry has developed specialized software systems for linear, integer, mixed integer and various nonlinear programming models, namely mathematical programming solvers.
[0073] 4. Mixed Integer Linear Programming (MILP)
[0074] Solving MILP problems is the core of mathematical programming solvers. Linear programming is an important branch of operations research that was studied early, has developed rapidly, is widely applied, and has mature methods. It is a mathematical theory and method for studying extremum problems of linear objective functions under linear constraints. In many real-world scenarios, such as production scheduling, supply chain management, and factory location problems, linear programming problems also carry integer constraints. These problems can usually be modeled as mixed-integer linear programming (MILP) problems.
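For reference, a MILP in a common standard form can be written as follows. The symbols $c$, $A$, $b$, and the index set $I$ are notation introduced here for illustration and do not appear in the application.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & c^{\top} x \\
\text{s.t.} \quad & A x \le b, \\
& x_j \in \mathbb{Z} \quad \text{for all } j \in I \subseteq \{1, \dots, n\}.
\end{aligned}
```

Dropping the integrality requirement on the variables $x_j$, $j \in I$, gives the linear programming relaxation of the problem.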
[0075] Solving MILP problems essentially involves solving a series of linear programming problems to obtain the optimal solution. The core steps of MILP solving in a solver include, but are not limited to:
[0076] 1) Branching: Based on the initial solution, select variables to branch on and solve the problem again;
[0077] 2) Cutting planes: Based on the solution obtained after relaxation, a series of linear constraints is generated, and some of these constraints are selected and added to the original problem to narrow the feasible region;
[0078] 3) Primal heuristics: Using heuristics, a feasible solution can be obtained directly without fully solving the MILP problem.
[0079] The aforementioned steps, such as branching, cutting planes, and primal heuristics, each involve key decision algorithms. For example, in the branching step, the key decision algorithm is an algorithm for selecting and pruning nodes. Similarly, in the cutting plane step, the key decision algorithm is an algorithm for selecting cutting planes. In the primal heuristic step, the key decision algorithm is an algorithm for selecting variables in the diving heuristics among the primal heuristics.
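To make the notion of a key decision algorithm concrete, the sketch below shows one classic decision rule from the branching step, most-fractional variable selection. It is a textbook rule chosen for illustration, not an algorithm taken from this application; the function name, the tolerance, and the example values are assumptions.

```python
from typing import Optional, Sequence


def most_fractional_variable(
    lp_values: Sequence[float],
    integer_indices: Sequence[int],
    tol: float = 1e-6,
) -> Optional[int]:
    """Return the index of the integer variable whose LP value is farthest from an integer.

    Returns None when every integer variable is integral within `tol`,
    meaning no branching is needed at this node.
    """
    best_index, best_distance = None, tol
    for j in integer_indices:
        value = lp_values[j]
        distance = abs(value - round(value))  # distance to the nearest integer
        if distance > best_distance:
            best_index, best_distance = j, distance
    return best_index


# LP relaxation solution in which x0, x1, and x3 are required to be integer.
lp_solution = [2.0, 3.4, 0.75, 1.9]
branch_on = most_fractional_variable(lp_solution, integer_indices=[0, 1, 3])
print(branch_on)
```

A rule of this shape, with fixed inputs and a single decision as output, is the kind of component that could be swapped for an alternative without changing the rest of the solver.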
[0080] The selection and design of key decision-making algorithms involved in each of the above steps will significantly affect the overall solution efficiency of the solver.
[0081] In one related technical solution, key decision-making algorithms involved in each step of the solver are designed using expert experience. These algorithms are then embedded into the mathematical programming solver to solve specific problems. Based on the solver's performance, the embedded key decision-making algorithms are adjusted using expert experience, and the adjusted algorithms are then re-embedded into the solver to continue the subsequent solution process.
[0082] In the aforementioned technical solutions, the quality of the key decision algorithms embedded in the mathematical programming solver relies entirely on expert experience. On one hand, key decision algorithms in mathematical programming solvers, such as those for branching, cutting planes, and primal heuristics, typically involve complex logic control and numerical computation. Designing entirely new key decision algorithms based on expert experience, or adjusting existing ones based on solution results, requires significant time and manpower. On the other hand, experts who design and adjust key decision algorithms based on their past experience inevitably introduce their personal prior biases. This limits the explorable algorithm space to some extent, so that key decision algorithms designed using expert experience cannot fully explore the overall algorithm space.
[0083] In view of this, embodiments of this application provide a method for determining a solver. This method can automatically generate key decision algorithms involved in the solver using a large language model, enabling the solver to solve the user-input problem based on the optimal key decision algorithms automatically generated by the large language model, thereby obtaining a solution to the problem. This avoids relying on expert experience to design key decision algorithms in the solver, saving time and manpower costs. Furthermore, it allows for a thorough exploration of the overall algorithm space using a large language model, improving the accuracy of solving the problem using the solver.
[0084] The method for determining the solver described above can be executed on the user's end device or on the cloud device; this application embodiment does not specifically limit this.
[0085] It should be understood that the aforementioned end-side devices may include, but are not limited to, the user's mobile phone, computer, server, and other devices.
[0086] It should also be understood that the aforementioned cloud-side devices can be servers located in cloud data centers. For ease of understanding, the following description will first consider a cloud scenario that includes a cloud data center.
[0087] For example, Figure 1 is a schematic block diagram of a cloud scenario applicable to an embodiment of this application. As shown in Figure 1, the cloud scenario may include: a cloud management platform 110, the Internet 120, and a client 130.
[0088] As shown in Figure 1, the cloud management platform 110 is used to manage the infrastructure that provides multiple cloud services. The infrastructure includes multiple cloud data centers, each cloud data center includes multiple servers, and each server includes cloud service resources to provide corresponding cloud services to tenants.
[0089] The cloud management platform 110 can be located in a cloud data center and provides access interfaces (such as user interfaces or application program interfaces, APIs). Tenants can use client 130 to remotely access the cloud management platform 110, register a cloud account and password, and log in. After successful authentication of the cloud account and password, the tenant can further select and purchase virtual machines of specific specifications (processor, memory, disk) on the cloud management platform 110. After successful purchase, the cloud management platform 110 provides the remote login account and password for the purchased virtual machine, allowing client 130 to remotely log in and install and run the tenant's applications. Therefore, tenants can create, manage, log in to, and operate virtual machines in the cloud data center through the cloud management platform 110. Virtual machines can also be referred to as Elastic Compute Service (ECS) or Elastic Instances (different cloud service providers may use different names).
[0090] It should be understood that cloud service tenants can be individuals, businesses, schools, hospitals, government agencies, etc.
[0091] The cloud management platform 110 includes, but is not limited to, a user console, compute management services, network management services, storage management services, authentication services, and image management services. The user console provides an interface or API for interaction with tenants. The compute management services manage servers running virtual machines and containers, as well as bare metal servers. The network management services manage network services (such as gateways and firewalls). The storage management services manage storage services (such as data bucket services). The authentication services manage tenant account passwords. The image management services manage virtual machine images. Tenants can log in to the cloud management platform 110 via client 130 and the internet 120 to manage their rented cloud services.
[0092] The method for determining a solver provided by an embodiment of this application is described in detail below with reference to Figure 2. It should be understood that the example in Figure 2 is merely intended to help those skilled in the art understand the embodiments of this application, and is not intended to limit the embodiments to the specific numerical values or scenarios illustrated. Those skilled in the art can make various equivalent modifications or variations based on the example in Figure 2, and such modifications and variations also fall within the scope of the embodiments of this application.
[0093] Figure 2 is a schematic flowchart of a method for determining a solver according to an embodiment of this application. As shown in Figure 2, the method may include steps 210-240, which are described below.
[0094] Step 210: Obtain the context information of the prompt of the large language model.
[0095] In this embodiment, the context information of the prompt of the large language model can be obtained, which includes at least one evaluation metric of each of the historical algorithms.
[0096] For example, as shown in Figure 3, the above context can be obtained through the context learning module, which can obtain the evaluation metrics of at least one historical algorithm from the algorithm population.
[0097] It should be understood that the aforementioned prompt is an injected instruction used to "direct" the large language model to think about the problem and output content according to a preset approach. It is an instruction or message that can guide or trigger the large language model to respond.
[0098] For example, the prompt above may include a task that the user specifies for the large language model to perform and a format that the user specifies for the output information. For instance, the task may be: generating at least one new algorithm based on the evaluation metrics of at least one historical algorithm in the context information, using these metrics as supervision information. The specified output format may be, for example, pseudocode of the at least one new algorithm that can be executed by the solver.
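As a minimal sketch of how such a prompt might be assembled, the following Python fragment combines a task description, an output-format instruction, and the context information. The names `HistoricalAlgorithm` and `build_prompt`, the wording of the instructions, the metric keys, and the toy pseudocode are illustrative assumptions and are not taken from this application.

```python
from dataclasses import dataclass, field


@dataclass
class HistoricalAlgorithm:
    """Hypothetical record for one historical algorithm in the population."""
    name: str
    pseudocode: str
    metrics: dict = field(default_factory=dict)  # e.g. {"solution_error": 0.12}


def build_prompt(history, task, output_format):
    """Assemble a prompt: task, output format, then the context information."""
    lines = [f"Task: {task}", f"Output format: {output_format}", "Context:"]
    for algo in history:
        metric_text = ", ".join(f"{k}={v}" for k, v in sorted(algo.metrics.items()))
        lines.append(f"- {algo.name} ({metric_text})")
        lines.append(algo.pseudocode)
    return "\n".join(lines)


# Toy historical algorithm; the name, pseudocode, and values are made up.
history = [
    HistoricalAlgorithm("dive_a", "select variable with largest fractionality",
                        {"solution_error": 0.12, "solution_time": 3.5}),
]
prompt = build_prompt(
    history,
    task="Generate at least one new algorithm, using the metrics as supervision.",
    output_format="Pseudocode executable by the solver.",
)
```

Keeping the task, the output format, and the context as separate fields makes it straightforward to vary the context between iterations while the instructions stay fixed.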
[0099] The aforementioned historical algorithm can be an algorithm generated during the first n (n is an integer greater than 1) iterations of the current round, or it can be an algorithm generated during multiple iterations of the first m rounds. This application does not make any specific limitations on this.
[0100] The aforementioned historical algorithm can be a key decision algorithm in any one or more core steps of the solver, and this application does not make any specific limitations on it.
[0101] The evaluation metrics for each of the above-mentioned at least one historical algorithm can be the performance evaluation metrics for each of the at least one historical algorithm, or they can be other evaluation metrics for each of the at least one historical algorithm. This application does not make any specific limitations on this.
[0102] Optionally, before step 210, at least one second historical problem may be received, and the at least one second historical problem may be solved based on the at least one historical algorithm mentioned above to obtain at least one second solution corresponding to each of the at least one second historical problem, and to determine the evaluation index of each of the at least one historical algorithm.
[0103] For example, as shown in Figure 3, the multi-objective algorithm evaluation module can solve the at least one second historical problem based on the aforementioned at least one historical algorithm, obtaining at least one second solution for each of the at least one second historical problem, and determining the evaluation index of each of the at least one historical algorithm. The multi-objective algorithm evaluation module can also store the evaluation index of each of the at least one historical algorithm in the algorithm population.
[0104] It should be understood that the embodiments of this application do not specifically limit the above-mentioned at least one second historical problem. The at least one second historical problem may include, but is not limited to, at least one of the following: the historical problem used in the first n iterations of this round, the historical problem used in the first m iterations, and the problem to be solved received in the first m iterations.
[0105] In this embodiment of the application, the evaluation index of each of the at least one historical algorithm can be determined in the current iteration of the current round; or, the evaluation index of each of the at least one historical algorithm can be determined in the previous n iterations of the current round, and the evaluation index of each of the at least one historical algorithm can be recorded.
[0106] There are multiple ways to determine the evaluation metrics for at least one historical algorithm, and this application does not specifically limit these methods. The following examples illustrate different implementations for determining the evaluation metrics for at least one historical algorithm.
[0107] In one possible implementation, taking the performance metric as the solution error, the solution error of each of the at least one historical algorithm can be determined based on at least one second solution corresponding to each of the at least one second historical problem. For example, the solution error of each of the at least one historical algorithm can be determined by the difference between each of the at least one second solution corresponding to each of the at least one second historical problem and the feasible solution of the at least one second historical problem.
[0108] In another possible implementation, taking the solution time as the performance indicator, after solving the at least one second historical problem based on the above-mentioned at least one historical algorithm to obtain at least one second solution corresponding to each of the at least one second historical problem, the time to obtain at least one second solution corresponding to each of the at least one second historical problem can be calculated, and the time to obtain at least one second solution can be used as the solution time corresponding to each of the at least one historical algorithm.
[0109] In another possible implementation, taking the solution convergence speed as the performance index, after solving the at least one second historical problem based on the above-mentioned at least one historical algorithm to obtain at least one second solution corresponding to each of the at least one second historical problem, the convergence speed of the at least one second solution corresponding to each of the at least one second historical problem can be calculated respectively, and the convergence speed of the at least one second solution obtained respectively can be used as the solution convergence speed corresponding to each of the at least one historical algorithm.
[0110] It should be understood that the solution error metric is related to the at least one second solution, whereas the solution convergence speed and solution time metrics are not related to the at least one second solution.
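The following Python fragment is a minimal sketch of how the solution error and solution time of one algorithm could be measured over a set of historical problems. The function `evaluate_algorithm`, the `solve` callable (a stand-in for a solver run with the algorithm plugged in), the use of reference objective values, and the averaging over problems are all assumptions made for illustration.

```python
import time


def evaluate_algorithm(solve, problems, reference_values):
    """Return average solution error and average solution time for one algorithm.

    `solve` is a hypothetical callable standing in for a solver run with the
    algorithm plugged in; it maps a problem to an objective value.
    """
    errors, times = [], []
    for problem, reference in zip(problems, reference_values):
        start = time.perf_counter()
        value = solve(problem)
        times.append(time.perf_counter() - start)
        errors.append(abs(value - reference))
    n = len(errors)
    return {"solution_error": sum(errors) / n, "solution_time": sum(times) / n}


# Toy stand-in: each "problem" is a list of costs and the "algorithm" sums them.
metrics = evaluate_algorithm(sum, [[1, 2], [3, 4]], [3, 8])
```

In this sketch the error is computed from the returned solution value, while the time is measured around the call and does not depend on what the solution is.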
[0111] Step 220: Based on this prompt, use the large language model to output at least one new algorithm.
[0112] In this embodiment, the context information of the prompt obtained above can be input into a large language model, which uses the input context information to generate at least one new algorithm. The input information of the large language model includes the context information, and the output information of the large language model includes the at least one new algorithm.
[0113] As an example, the above context includes at least one historical algorithm and at least one evaluation metric for each historical algorithm, and the large language model can generate at least one new algorithm based on the at least one historical algorithm and at least one evaluation metric for each historical algorithm.
[0114] In another example, the aforementioned context includes the name of at least one historical algorithm and the evaluation metric of each historical algorithm. The large language model can obtain the at least one historical algorithm based on its name and generate at least one new algorithm based on the name of the at least one historical algorithm and the evaluation metric of each historical algorithm.
[0115] For example, a large language model can retrieve at least one historical algorithm stored in a database based on the name of at least one historical algorithm.
[0116] As another example, the large language model may have been fine-tuned on the at least one historical algorithm, in which case it can directly determine the at least one historical algorithm based on its name.
[0117] It should be understood that the above-mentioned at least one historical algorithm refers to the pseudocode corresponding to each of the at least one historical algorithm. The solver can solve the problem to be solved by running the pseudocode.
[0118] For example, as shown in Figure 3, the acquired contextual information can be input into the large language model through the context learning module.
[0119] It should be understood that the aforementioned large language model is a pre-trained model with context learning capabilities. This context learning capability refers to the ability of the large language model to understand and generate text that conforms to the given context information without requiring any changes to the model weights. This capability enables the large language model to more accurately grasp the semantics, sentiment, and intent of text when processing natural language tasks, thereby generating more logical and coherent text.
[0120] This application does not specifically limit the architecture of the large language model described above, as long as the large language model has the ability to learn context. For example, the large language model can be based on the following architectures: Transformer architecture, Encoder-Decoder architecture, Convolutional Neural Network (CNN) architecture, CNN-Transformer architecture, Sonnet architecture, Generative Pre-trained Transformer (GPT) architecture, etc.
[0121] Optionally, in some embodiments, the context information further includes template information of at least one historical algorithm, which includes at least one of the following: input information, output information, and decision content of the historical algorithm.
[0122] For example, as shown in Figure 3, template information of at least one historical algorithm can be obtained through a key code recognition module. For instance, this key code recognition module is used to obtain solution logs of the solver solving at least one second historical problem based on the aforementioned at least one historical algorithm, and to obtain template information of the aforementioned at least one historical algorithm by analyzing these solution logs.
[0123] The input and output information of the aforementioned historical algorithm can be parameters or logic within that algorithm. For example, taking the case where the historical algorithm is the key decision algorithm for selecting a cutting plane in the cutting plane step of the solver, the input information of the historical algorithm includes multiple input cutting planes, the output information includes the target cutting plane selected from the multiple input cutting planes, and the decision content includes selecting the target cutting plane from the multiple input cutting planes based on a cutting plane selection strategy.
[0124] As another example, taking the case where the historical algorithm is the key decision algorithm for selecting a node in the branching step of the solver, the input information of the historical algorithm includes multiple input nodes, the output information includes the target node selected from the multiple input nodes, and the decision content includes selecting the target node from the multiple input nodes based on a node selection strategy.
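The template information described above can be pictured as a small record with three fields. The Python sketch below is illustrative only: the class `AlgorithmTemplate`, the function `select_cut`, and the use of an efficacy value as the selection score are assumptions, not details given in this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmTemplate:
    """Hypothetical container for the template information of one algorithm."""
    input_info: str
    output_info: str
    decision_content: str


CUT_SELECTION = AlgorithmTemplate(
    input_info="multiple input cutting planes",
    output_info="the target cutting plane",
    decision_content="select the target cutting plane using a selection strategy",
)


def select_cut(cuts, score):
    """One algorithm fitting the template: pick the cut with the best score."""
    return max(cuts, key=score)


# Toy cuts given as (name, efficacy); using efficacy as the score is an assumption.
cuts = [("c1", 0.2), ("c2", 0.9), ("c3", 0.5)]
chosen = select_cut(cuts, score=lambda cut: cut[1])
```

Any new algorithm that accepts the same inputs and returns the same kind of output can replace `select_cut` without changes elsewhere, which is the purpose of fixing the template.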
[0125] Optionally, in some embodiments, the above prompt is used to instruct the large language model to use the evaluation metrics of each of the above at least one historical algorithm as supervision information so that it can generate the above at least one new algorithm.
[0126] In the above technical solution, the evaluation metrics of at least one historical algorithm are used as supervision information so that the large language model has a greater probability of generating at least one high-quality new algorithm.
[0127] As an example, at least one historical algorithm and at least one new algorithm output by the large language model have the same function. It should be understood that at least one historical algorithm and at least one new algorithm having the same function means that: the input information of at least one historical algorithm and at least one new algorithm are the same, or the output information of at least one historical algorithm and at least one new algorithm are the same, or the core decision content of at least one historical algorithm and at least one new algorithm is the same.
[0128] For example, taking at least one historical algorithm and at least one new algorithm as the key decision-making algorithms for selecting nodes, the core decision-making content of the at least one historical algorithm and at least one new algorithm includes selecting the target node from multiple input nodes based on the node selection strategy.
[0129] For example, taking at least one historical algorithm and at least one new algorithm as the key decision-making algorithms for selecting a cutting plane, the core decision-making content of the at least one historical algorithm and at least one new algorithm includes selecting a target cutting plane from multiple input cutting planes based on a cutting plane selection strategy.
[0130] It should be noted that the above example uses a large language model generating a key decision algorithm for one function in one iteration. In addition, the large language model can also generate key decision algorithms for multiple functions simultaneously in one iteration. In this case, the context information input to the large language model includes historical algorithms for multiple functions. This application does not specifically limit this aspect.
[0131] Step 230: Determine the evaluation metrics for at least one new algorithm.
[0132] In one implementation, at least one first historical problem can be received, and the at least one first historical problem can be solved based on the at least one new algorithm mentioned above to obtain at least one first solution corresponding to each of the at least one first historical problem, and the evaluation index of each of the at least one new algorithm can be determined.
[0133] The evaluation metrics for each of the above-mentioned at least one new algorithm can be the performance evaluation metrics for each of the at least one new algorithm, or they can be other evaluation metrics for each of the at least one new algorithm. This application does not make any specific limitations in this regard.
[0134] It should be understood that the embodiments of this application do not specifically limit the above-mentioned at least one first historical problem. The at least one first historical problem may include, but is not limited to, at least one of the following: the historical problem used in the first n iterations of this round, the historical problem used in the first m iterations, and the problem to be solved received in the first m iterations.
[0135] The above-mentioned at least one first historical problem and the above-mentioned at least one second historical problem may be the same or different, and the embodiments of this application do not specifically limit this.
[0136] In this application, there are multiple ways to determine the evaluation metrics for each of the at least one new algorithm. The following examples illustrate different implementations for determining the evaluation metrics for each of the at least one new algorithm.
[0137] In one possible implementation, taking the performance metric as the solution error, the solution error of each of the at least one new algorithm can be determined based on at least one first solution corresponding to each of the at least one first historical problem. For example, the solution error of each of the at least one new algorithm can be determined by the difference between each of the at least one first solution corresponding to each of the at least one first historical problem and the feasible solution of the at least one first historical problem.
[0138] In another possible implementation, taking the solution time as the performance indicator, after solving the at least one first historical problem based on the above-mentioned at least one new algorithm to obtain at least one first solution corresponding to each of the at least one first historical problem, the time to obtain at least one first solution corresponding to each of the at least one first historical problem can be calculated, and the time to obtain at least one first solution can be used as the solution time corresponding to each of the at least one new algorithm.
[0139] In another possible implementation, taking the solution convergence speed as the performance index, after solving the at least one first historical problem based on the above at least one new algorithm to obtain at least one first solution corresponding to each of the at least one first historical problem, the convergence speed of the at least one first solution corresponding to each of the at least one first historical problem can be calculated respectively, and the convergence speed of the at least one first solution obtained respectively can be used as the solution convergence speed corresponding to each of the at least one new algorithm.
[0140] It should be understood that the solution error metric is related to the at least one first solution, whereas the solution convergence speed and solution time metrics are not related to the at least one first solution.
[0141] Step 240: Select an algorithm set from at least one historical algorithm and at least one new algorithm based on the evaluation metrics corresponding to each of the at least one new algorithm and the evaluation metrics corresponding to each of the at least one historical algorithm.
[0142] In this embodiment of the application, after obtaining the evaluation metrics corresponding to at least one new algorithm and at least one historical algorithm, an algorithm set can be selected from at least one historical algorithm and at least one new algorithm based on the evaluation metrics corresponding to at least one new algorithm and at least one historical algorithm. The algorithm set includes at least one target algorithm, which is used by the solver to solve the received problem to obtain a solution to the problem.
[0143] In one possible implementation, the aforementioned set of algorithms can be selected from at least one historical algorithm and at least one new algorithm based on a population evolution mechanism. For example, using the population evolution mechanism, based on the evaluation metrics corresponding to at least one new algorithm and at least one historical algorithm, algorithms with poor evaluation metrics among at least one historical algorithm and at least one new algorithm are eliminated, while algorithms with good evaluation metrics are retained.
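A minimal sketch of such an elimination step is shown below. It assumes a single scalar metric (solution error, lower is better) and a fixed population capacity; the function name `evolve_population` and both of these choices are illustrative assumptions, since this application does not prescribe a particular selection rule.

```python
def evolve_population(history, new, capacity):
    """Keep the `capacity` algorithms with the lowest solution error.

    Each entry is a (name, solution_error) pair; lower is better.
    """
    merged = list(history) + list(new)
    merged.sort(key=lambda item: item[1])
    return merged[:capacity]


history = [("h1", 0.30), ("h2", 0.10)]
new = [("n1", 0.05), ("n2", 0.40)]
survivors = evolve_population(history, new, capacity=2)
```

Here one new algorithm displaces a weaker historical one, while the stronger historical algorithm is retained alongside it.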
[0144] The problem to be solved may be received by the solver or by the device that executes the above method flow (e.g., the device that determines the solver). This application embodiment does not specifically limit this.
[0145] The problem to be solved can be input directly by the user to the solver or the device that determines the solver, or it can be passed to the solver or the device that determines the solver by other modules (such as computer-aided engineering (CAE) simulation modules). This application does not specifically limit this.
[0146] In the above technical solution, at least one historical algorithm and its respective evaluation index are input as contextual information into a large language model. Utilizing the context learning capability of the large language model, at least one new algorithm is obtained from its output. Based on the evaluation indices corresponding to the at least one new algorithm and the at least one historical algorithm, at least one target algorithm is selected from the at least one historical algorithm and the at least one new algorithm. This allows the solver to use the at least one target algorithm to solve the received problem and obtain its solution. This avoids relying on expert experience to design the solver's algorithms, saving time and manpower costs. Furthermore, it allows the large language model to fully explore the overall algorithm space, improving the solver's accuracy.
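Steps 210 to 240 can be sketched as one iteration of a loop. In the Python fragment below, `generate` is a hypothetical stand-in for the call to the large language model and `evaluate` is a hypothetical stand-in for running the solver on historical problems; the dictionary layout, the stub `fake_llm`, and the scoring by code length are assumptions chosen only to keep the example runnable.

```python
def run_iteration(population, generate, evaluate, capacity):
    """One iteration of steps 210-240 over a {name: (code, score)} population."""
    # Step 210: context information = historical algorithms and their metrics.
    context = [(name, code, score) for name, (code, score) in population.items()]
    # Step 220: the (stubbed) model proposes new algorithms from the context.
    candidates = generate(context)
    # Step 230: evaluate every new algorithm.
    scored = {name: (code, evaluate(code)) for name, code in candidates.items()}
    # Step 240: keep the best `capacity` algorithms (lower score is better).
    merged = {**population, **scored}
    best = sorted(merged, key=lambda name: merged[name][1])[:capacity]
    return {name: merged[name] for name in best}


def fake_llm(context):
    """Stub: 'mutates' the best historical algorithm by shortening its code."""
    name, code, _ = min(context, key=lambda entry: entry[2])
    return {name + "_v2": code[:-1]}


population = {"a": ("xxxx", 4), "b": ("xxxxxx", 6)}
population = run_iteration(population, fake_llm, evaluate=len, capacity=2)
```

Calling `run_iteration` repeatedly feeds the surviving algorithms back in as the historical algorithms of the next iteration.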
[0147] In this embodiment of the application, after obtaining at least one target algorithm, it can be integrated into the solver. As an example, the at least one target algorithm can be embedded into the solver by updating the dynamic link library file. For instance, the solver's default dynamic link library file can be updated using methods such as LD_PRELOAD, thereby embedding the at least one target algorithm into the solver.
[0148] The aforementioned dynamic link library file can be a .so file on the Linux platform or a .dll file on the Windows platform; this application embodiment does not specifically limit it in this regard.
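On Linux, one way to apply `LD_PRELOAD` is to set it in the environment of the solver process. The sketch below assumes the target algorithm has already been compiled into a shared library; the library path, the solver command line, and the helper names `preload_env` and `run_solver` are hypothetical.

```python
import os
import subprocess


def preload_env(library_path, base_env=None):
    """Return an environment in which `library_path` is preloaded first."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


def run_solver(command, library_path):
    """Launch the solver process with the generated library preloaded."""
    return subprocess.run(command, env=preload_env(library_path), check=True)


# Both the library path and the command line below are hypothetical.
env = preload_env("/opt/algos/libtarget_algo.so", base_env={"PATH": "/usr/bin"})
# run_solver(["solver", "problem.mps"], "/opt/algos/libtarget_algo.so")
```

Because the preloaded library is resolved before the solver's default libraries, symbols it defines take precedence, so the solver binary itself does not need to be rebuilt.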
[0149] For example, as shown in Figure 3, after the at least one target algorithm is integrated into the solver, the solver can receive a problem and solve it based on the integrated at least one target algorithm to obtain the optimal solution to the problem.
[0150] As an example, the following uses the open-source Solving Constraint Integer Programs (SCIP) solver as the basic mathematical programming solver framework, and takes the heuristic algorithm discovery in the solution process of mixed integer programming as the implementation scenario to verify the effectiveness of the embodiments of this application. It should be understood that heuristic algorithms are used in solving mixed integer programming problems to explore feasible solutions to the problem to be solved. The smaller the gap between the feasible solution and the optimal solution, the higher the quality of the feasible solution found.
[0151] For example, regarding the experimental datasets, this application uses four synthetic datasets: set cover (500 constraints, 1000 variables), max independent set (affinity 4, 500 nodes), combinatorial auction (100 items, 500 bids), and facility location (100 facilities).
[0152] For example, in terms of test metrics, the main comparison is between the solution performance of the different baselines and of the embodiments of this application on each dataset. The metric used is the primal gap, defined as the relative difference between the feasible solution and the optimal solution at the root node.
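The gap metric defined above, commonly called the primal gap, can be computed as in the following sketch. This application does not give the exact formula, so the normalization used here (dividing by the larger absolute objective value, with special cases for equal values and opposite signs) is an assumption based on one widely used convention.

```python
def primal_gap(feasible_value, optimal_value):
    """Relative difference between a feasible and an optimal objective value.

    Returns a value in [0, 1]; 0 means the feasible solution is optimal.
    """
    if feasible_value == optimal_value:
        return 0.0
    if feasible_value * optimal_value < 0:
        # Opposite signs: this convention reports the maximum gap.
        return 1.0
    scale = max(abs(feasible_value), abs(optimal_value))
    return abs(feasible_value - optimal_value) / scale


gap = primal_gap(110.0, 100.0)  # feasible objective 110 vs. optimal objective 100
```

A smaller value means the heuristic found a feasible solution closer to the optimum.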
[0153] For example, in terms of comparison schemes, this application compares the best-performing human-designed diving heuristic algorithms in SCIP, such as the PseudoCost Diving algorithm and the Coefficient Diving algorithm, and denotes the performance of the human-designed algorithms on the dataset as Best Human-Designed. In addition, it also compares deep learning-based methods, denoted as L2DIVE.
[0154] The experimental results of this application's embodiments are shown in Table 1. Here, LLM4Solver refers to the algorithm obtained by the framework of this application through separate evolution on each dataset. LLM4Solver with MOEA refers to the algorithm obtained by the framework of this application using multi-objective evolution, simultaneously considering the performance of an algorithm on the four datasets. Therefore, LLM4Solver with MOEA represents the performance of a single algorithm on the four datasets, while the other rows represent algorithms trained on the corresponding datasets. Observing the test results yields two conclusions: 1) The algorithm designed by the large language model (LLM)-based algorithm discovery framework can find better feasible solutions, outperforming the best manually designed algorithm (average improvement of over 50%) and deep learning-based algorithms (average improvement of 12%) on all four datasets; 2) The algorithm obtained through multi-objective evolution has stronger generalization ability: the resulting single algorithm outperforms manually designed algorithms on all four datasets, and outperforms deep learning algorithms specifically trained on those datasets on two of them.
[0155] Table 1 Experimental Results
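Multi-objective evolution over several datasets is often built on a non-dominated (Pareto) filter. The sketch below illustrates that idea only; this application does not state which multi-objective evolutionary algorithm is used, and the algorithm names and score vectors are made-up illustrative values, not experimental results.

```python
def dominates(a, b):
    """True if score vector `a` is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Return names of algorithms not dominated by any other algorithm."""
    return sorted(
        name for name, vec in scores.items()
        if not any(dominates(other, vec) for other in scores.values())
    )


# One gap value per dataset (lower is better); all numbers are illustrative.
scores = {
    "algo_a": (0.10, 0.20, 0.30, 0.40),
    "algo_b": (0.20, 0.10, 0.30, 0.40),
    "algo_c": (0.20, 0.20, 0.40, 0.50),
}
front = pareto_front(scores)
```

An algorithm that is worse than another on every dataset is discarded, while algorithms that trade off performance between datasets are both kept.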
[0156] As another example, the OptVerse (OPTV) solver is used as the basic solver framework, and heuristic algorithm discovery in the solution process of mixed integer programming is taken as the implementation scenario to verify the effectiveness of the embodiments of this application.
[0157] For example, the experimental dataset primarily uses the Advanced Planning and Scheduling (APS) dataset. The testing metrics are: 1) Within a fixed solution time, comparing the difference between the primal bound and the dual bound (primal-dual gap); a smaller difference indicates higher quality of the feasible solution discovered by the heuristic algorithm; 2) Within a fixed solution time, comparing the primal bound of the objective function value of the found feasible solution; a smaller primal bound indicates better performance of the LLM algorithm.
[0158] For example, in terms of comparison schemes, this application compares another solver G with a solver D that does not utilize a large model for algorithm discovery.
[0159] The experimental results of this application embodiment on the APS dataset are shown in Table 2. The scheme provided by this application embodiment improves significantly on the original solver: after adding the algorithm discovered by the LLM, the TianChou solver OPTV shows a 13% improvement in the primal-dual gap metric and an improvement of more than 90% in the primal bound metric, and can surpass solver G.
[0160] Table 2 Experimental Results
[0161] The methods provided by the embodiments of this application have been described in detail above with reference to Figures 1 to 3. The apparatus embodiments of this application are described in detail below with reference to Figures 4 to 8. It should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; therefore, for any parts not described in detail, refer to the preceding method embodiments.
[0162] Figure 4 is a schematic block diagram of a solver determination device 400 provided in an embodiment of this application. The device 400 can be implemented by software, hardware, or a combination of both. The device 400 provided in this embodiment can implement the method flow shown in Figure 2 of this embodiment. The device 400 includes: an acquisition module 410, a generation module 420, a determination module 430, and a filtering module 440. The acquisition module 410 is used to acquire the context information of the prompt of a large language model, which includes evaluation metrics for at least one historical algorithm. The generation module 420 is used to use the large language model based on the prompt to output at least one new algorithm. The determination module 430 is used to determine the evaluation metrics corresponding to each of the at least one new algorithm. The filtering module 440 is used to filter an algorithm set from the at least one historical algorithm and the at least one new algorithm based on the evaluation metrics corresponding to the at least one new algorithm and the evaluation metrics corresponding to the at least one historical algorithm. The algorithm set includes at least one target algorithm, which is used by the solver to solve the received problem to obtain a solution to the problem.
[0163] Optionally, the context includes the at least one historical algorithm and its respective evaluation metric.
[0164] Optionally, the determination module 430 is specifically used to: receive at least one first historical problem; solve the at least one first historical problem based on each of the at least one new algorithm, obtain at least one first solution corresponding to each of the at least one first historical problem, and determine the evaluation metric of each of the at least one new algorithm.
[0165] Optionally, the acquisition module 410 is further configured to receive at least one second historical problem; the acquisition module 410 is further configured to solve the at least one second historical problem based on the at least one historical algorithm respectively, obtain at least one second solution corresponding to each of the at least one second historical problem, and determine the evaluation index of each of the at least one historical algorithm.
[0166] Optionally, the filtering module 440 is specifically used to: select, based on a population evolution mechanism, the algorithm set from the at least one historical algorithm and the at least one new algorithm according to the evaluation metric corresponding to each of the at least one new algorithm and the evaluation metric corresponding to each of the at least one historical algorithm.
[0167] Optionally, the context may also include template information of the at least one historical algorithm, which includes at least one of the following: input information, output information, and decision content of the historical algorithm.
[0168] Optionally, the prompt is used to instruct the large language model to use the evaluation metrics of each of the at least one historical algorithm as supervision information to generate the at least one new algorithm; the generation module 420 is specifically used to: input the context information and the prompt information into the large language model, and the large language model outputs the at least one new algorithm.
[0169] Optionally, the at least one historical algorithm and the at least one new algorithm have the same function.
[0170] Optionally, the acquisition module 410 is also used to receive configuration data input by the user, which includes configuration parameters of the large language model.
[0171] The device 400 here can be embodied in the form of a functional module. The term "module" here can be implemented in software and / or hardware, without specific limitations.
[0172] For example, a "module" can be a software program, a hardware circuit, or a combination of both that implements the above functions. The implementation of a module is described below using the acquisition module 410 as an example. The implementation of the other modules, such as the generation module 420, the determination module 430, and the filtering module 440, can refer to the implementation of the acquisition module 410.
[0173] As an example of a software functional unit, the acquisition module 410 may include code running on a computing instance. The computing instance may include at least one of a physical host (computing device), a virtual machine, or a container. Further, the aforementioned computing instance may be one or more. For example, the acquisition module 410 may include code running on multiple hosts / virtual machines / containers. It should be noted that the multiple hosts / virtual machines / containers used to run the code may be distributed in the same region or in different regions. Further, the multiple hosts / virtual machines / containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one or more geographically proximate data centers. Typically, a region may include multiple AZs.
[0174] Similarly, multiple hosts / virtual machines / containers used to run this code can be distributed within the same Virtual Private Cloud (VPC) or across multiple VPCs. Typically, a VPC is set up within a region. Communication between two VPCs within the same region, as well as between VPCs in different regions, requires a communication gateway to be set up within each VPC to enable interconnection between VPCs.
[0175] As an example of a hardware functional unit, the acquisition module 410 may include at least one computing device, such as a server. Alternatively, the acquisition module 410 may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be implemented using a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
[0176] The multiple computing devices included in the acquisition module 410 can be distributed in the same region or in different regions. Similarly, the multiple computing devices included in the acquisition module 410 can be distributed in the same Availability Zone (AZ) or in different AZs. Likewise, the multiple computing devices included in the acquisition module 410 can be distributed in the same Virtual Private Cloud (VPC) or in multiple VPCs. These multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
[0177] Therefore, the modules of the various examples described in the embodiments of this application can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0178] It should be noted that the above embodiments of the device, when executing the above methods, are only illustrative examples of the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. For example, the acquisition module 410 can be used to execute any step in the above methods, the generation module 420 can be used to execute any step in the above methods, the determination module 430 can be used to execute any step in the above methods, and the filtering module 440 can be used to execute any step in the above methods. The steps implemented by the acquisition module 410, generation module 420, determination module 430, and filtering module 440 can be specified as needed. By implementing different steps in the above methods through the acquisition module 410, generation module 420, determination module 430, and filtering module 440, all the functions of the above device can be realized.
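One possible division of the steps among the four modules can be sketched in Python as follows. This is an illustrative sketch only: the toy problem format (a list of candidate costs), the metric (mean cost of the chosen candidate, lower is better), the stub `stub_llm` that stands in for the large language model, and the simple keep-the-best selection used for the filtering step are all assumptions and not the implementation of this application.

```python
from typing import Callable, Dict, List

# Hypothetical algorithm signature: given candidate costs, return the chosen index.
Algorithm = Callable[[List[float]], int]


def acquire(history_metrics: Dict[str, float]) -> str:
    """Acquisition module 410: turn historical metrics into context text."""
    return "\n".join(f"{n}: {m:.2f}" for n, m in sorted(history_metrics.items()))


def generate(context: str, llm: Callable[[str], Dict[str, Algorithm]]) -> Dict[str, Algorithm]:
    """Generation module 420: ask the model (here a stub) for new algorithms."""
    return llm(context)


def determine(new_algorithms: Dict[str, Algorithm], problems: List[List[float]]) -> Dict[str, float]:
    """Determination module 430: score each new algorithm on historical problems."""
    return {
        name: sum(p[algo(p)] for p in problems) / len(problems)
        for name, algo in new_algorithms.items()
    }


def filter_population(history_metrics: Dict[str, float], new_metrics: Dict[str, float], size: int) -> List[str]:
    """Filtering module 440: keep the `size` best algorithms from the merged population."""
    merged = {**history_metrics, **new_metrics}
    return sorted(merged, key=merged.get)[:size]


def stub_llm(context: str) -> Dict[str, Algorithm]:
    """Stand-in for a real model call; always proposes the same algorithm."""
    return {"pick_min": lambda costs: costs.index(min(costs))}


history_metrics = {"pick_first": 3.0, "pick_last": 4.0}
problems = [[4.0, 1.0, 5.0], [2.0, 6.0, 3.0]]
new_algorithms = generate(acquire(history_metrics), stub_llm)
new_metrics = determine(new_algorithms, problems)
target_algorithms = filter_population(history_metrics, new_metrics, size=2)
```

Repeating the four calls, with the surviving algorithms fed back in as the next round's historical algorithms, would give one plausible form of iterative evolution; other assignments of steps to modules are equally possible.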
[0179] Furthermore, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments above, which will not be repeated here.
[0180] Figure 5 is a schematic deployment diagram of a device for determining a solver according to an embodiment of this application. As shown in Figure 5, the device for determining a solver can be abstracted into a cloud service by a cloud service provider on a cloud management platform and provided to users. After the user purchases the cloud service on the cloud management platform, the cloud environment provides the user with the service for determining a solver.
[0181] As shown in Figure 5, the solver described above can also be abstracted into a cloud service by the cloud service provider on the cloud management platform and provided to the user. After the user purchases the cloud service on the cloud management platform, the cloud environment uses the cloud service to provide the user with a cloud service that uses the solver to solve problems.
[0182] For example, a tenant logs into the cloud management platform with a pre-registered account and password on the public cloud access page. After successful login, the tenant selects and purchases the cloud service for determining a solver and / or the cloud service for solving problems using the solver. A tenant who purchases the cloud service for determining a solver can use the functions provided by that service to determine a solver. A tenant who purchases the cloud service for solving problems using the solver can use the functions provided by that service to solve problems.
[0183] For example, the cloud management platform primarily manages the infrastructure that runs the cloud service for determining a solver and the cloud service for solving problems using the solver. This infrastructure may include multiple data centers located in different regions, each containing multiple servers. The data centers provide basic resources, such as computing resources and storage resources, for both cloud services. Therefore, when tenants purchase and use the cloud service for determining a solver and / or the cloud service for solving problems using the solver, they primarily pay for the resources they use.
[0184] As shown in Figure 5, taking a tenant's purchase of the cloud service for determining a solver as an example, the user can upload at least one of the following to the cloud environment through an application programming interface (API) or a web interface provided by the cloud management platform: at least one first historical problem and at least one second historical problem. The cloud service for determining a solver calls the device for determining a solver to determine at least one target algorithm, which can be embedded in the solver, for example, by updating a dynamic link library file.
[0185] As shown in Figure 5, taking a tenant's purchase of a cloud service that uses a solver for problem solving as an example, the user can upload the problem to be solved to the cloud environment via API or through the web interface provided by the cloud management platform. The cloud service that uses the solver will then call the solver to solve the problem and obtain the solution. The solution can then be returned to the user's terminal device through the cloud environment.
[0186] The aforementioned terminal devices can be mobile phones, laptops, tablets, handheld computers, wireless terminals in smart cities, wireless devices in smart homes, etc.
[0187] In some embodiments, users can also upload configuration parameters through the configuration interface or API on the public cloud access page provided by the cloud management platform. These configuration parameters may include, but are not limited to, at least one of the following: the architecture of the large language model, the parameters of the large language model, the temperature parameter of the large language model (a hyperparameter in the large language model that affects the probability distribution of the output results of the large language model), the number of algorithm evolution iterations required to obtain at least one target algorithm, the key decision algorithm selected, and the types of algorithm evaluation metrics.
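To illustrate the configuration parameters listed above, the following Python sketch groups them into one record and shows one common way a temperature parameter can reshape an output probability distribution. The class `EvolutionConfig`, its field names and default values, and the use of a temperature-scaled softmax are assumptions for illustration; this application does not specify how the large language model applies the temperature.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvolutionConfig:
    """Hypothetical container for user-uploaded configuration parameters."""
    model_name: str = "example-llm"   # placeholder, not a real model identifier
    temperature: float = 0.8          # temperature parameter of the model
    iterations: int = 10              # number of algorithm evolution iterations
    key_decision: str = "branching"   # which key decision algorithm to evolve
    metric_type: str = "solve_time"   # type of algorithm evaluation metric


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities, dividing by the temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.5)  # low temperature: peaked distribution
flat = softmax_with_temperature(logits, 2.0)   # high temperature: flatter distribution
```

Under this convention, a lower temperature concentrates probability on the highest-scoring output, while a higher temperature spreads it out, which would make the generated algorithms more varied.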
[0188] In this embodiment of the application, when the device for determining the solver is a software device, the device can be deployed on a computing device in any environment, or on a computing device cluster consisting of multiple computing devices in any environment.
[0189] The aforementioned computing device can also be referred to as a computer system. It includes a hardware layer, an operating system layer running on top of the hardware layer, and an application layer running on top of the operating system layer. The hardware layer includes hardware such as processing units, memory, and memory control units; the functions and structure of this hardware will be described in detail later. The operating system can be any one or more computer operating systems that implement business processing through processes, such as Linux, Unix, Android, iOS, or Windows. The application layer includes applications such as browsers, address books, word processing software, and instant messaging software. Optionally, the computer system can be a handheld device such as a smartphone, or a terminal device such as a personal computer; this application does not specifically limit this, as long as the method provided in the embodiments of this application can be used. The execution subject of the method provided in the embodiments of this application can be a computing device, or a functional module within the computing device capable of calling and executing programs.
[0190] The following describes in detail a computing device provided in an embodiment of this application, with reference to Figure 6.
[0191] Figure 6 is a schematic diagram of the architecture of a computing device 1500 provided in an embodiment of this application. The computing device 1500 may be a server, a computer, or other device with computing capabilities. The computing device 1500 shown in Figure 6 includes at least one processor 1510 and a memory 1520.
[0192] It should be understood that this application does not limit the number of processors and memories in the computing device 1500.
[0193] The processor 1510 executes instructions in the memory 1520, causing the computing device 1500 to implement the method provided in this application. Alternatively, the processor 1510 executes instructions in the memory 1520, causing the computing device 1500 to implement the various functional modules provided in this application, thereby implementing the method provided in this application.
[0194] Optionally, the computing device 1500 also includes a communication interface 1530. The communication interface 1530 uses a transceiver module, such as, but not limited to, a network interface card or a transceiver, to enable communication between the computing device 1500 and other devices or communication networks.
[0195] Optionally, the computing device 1500 also includes a system bus 1540, wherein the processor 1510, memory 1520, and communication interface 1530 are respectively connected to the system bus 1540. The processor 1510 can access the memory 1520 through the system bus 1540; for example, the processor 1510 can perform data read / write or code execution in the memory 1520 through the system bus 1540. The system bus 1540 is a peripheral component interconnect express (PCIe) bus or an extended industry standard architecture (EISA) bus, etc. The system bus 1540 is divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is used to represent the bus in Figure 6, but this does not mean that there is only one bus or one type of bus.
[0196] In one possible implementation, the processor 1510 primarily functions to interpret the instructions (or code) of a computer program and process data within the computer software. The instructions of the computer program and the data within the computer software can be stored in memory 1520 or cache 1516.
[0197] Optionally, processor 1510 may be an integrated circuit chip with signal processing capabilities. By way of example and not limitation, processor 1510 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. Among these, a general-purpose processor is a microprocessor, etc. For example, processor 1510 may be a central processing unit (CPU).
[0198] Optionally, each processor 1510 includes at least one processing unit 1512 and a memory control unit 1514.
[0199] Optionally, the processing unit 1512, also known as the core, is the most important component of the processor. The processing unit 1512 is manufactured from single-crystal silicon using a specific production process. All calculations, command reception, command storage, and data processing are performed by the core. Each processing unit independently executes program instructions, utilizing parallel computing capabilities to accelerate program execution. Various processing units have fixed logical structures; for example, a processing unit includes logical units such as a Level 1 cache, a Level 2 cache, an execution unit, an instruction-level unit, and a bus interface.
[0200] In one implementation example, the memory control unit 1514 controls the data interaction between the memory 1520 and the processing unit 1512. Specifically, the memory control unit 1514 receives memory access requests from the processing unit 1512 and controls access to memory based on the memory access requests. By way of example and not limitation, the memory control unit is a device such as a memory management unit (MMU).
[0201] In one implementation example, each memory control unit 1514 addresses the memory 1520 via the system bus. An arbiter (not shown in Figure 6) is configured on the system bus to handle and coordinate contention for access by multiple processing units 1512.
[0202] In one implementation example, the processing unit 1512 and the memory control unit 1514 are connected via internal chip connection lines, such as address lines, thereby enabling communication between the processing unit 1512 and the memory control unit 1514.
[0203] Optionally, each processor 1510 also includes a cache 1516, which is a buffer for data exchange. When the processing unit 1512 needs to read data, it first looks for the required data in the cache 1516. If the data is found, it is read directly from the cache; otherwise, the processing unit 1512 looks for the data in memory. Since the cache operates much faster than memory, its purpose is to help the processing unit 1512 run faster.
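The lookup order described above can be modeled in a few lines of Python. This is a simplified software analogy, not a description of the hardware: the dictionaries standing in for the cache and the memory, the function name `read`, and the policy of filling the cache on a miss are assumptions for illustration.

```python
from typing import Dict, Tuple


def read(address: int, cache: Dict[int, int], memory: Dict[int, int]) -> Tuple[int, bool]:
    """Return (value, hit): look in the cache first, then fall back to memory."""
    if address in cache:
        return cache[address], True
    value = memory[address]
    cache[address] = value  # assumed fill-on-miss policy; not stated in the text
    return value, False


memory = {0x10: 7, 0x20: 9}
cache: Dict[int, int] = {}
first = read(0x10, cache, memory)   # miss: the value is fetched from memory
second = read(0x10, cache, memory)  # hit: the value is now served from the cache
```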
[0204] The memory 1520 provides runtime space for processes in the computing device 1500. For example, the memory 1520 stores the computer program (specifically, the program code) used to generate the process. After the computer program is run by the processor to generate a process, the processor allocates corresponding storage space for the process in the memory 1520. The storage space further includes text segments, initialized data segments, uninitialized data segments, stack segments, heap segments, etc. The memory 1520 stores data generated during the process's execution, such as intermediate data or process data, in the storage space corresponding to the process.
[0205] Optionally, the memory, also known as RAM, is used to temporarily store the data processed by the processor 1510, as well as data exchanged with external storage devices such as hard disks. As long as the computer is running, the processor 1510 will load the data that needs to be processed into RAM for processing, and after the processing is completed, the processing unit 1512 will send the result out.
[0206] By way of example and not limitation, memory 1520 is volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. Non-volatile memory is read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory is random access memory (RAM) used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory 1520 of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
[0207] The structure of the computing device 1500 listed above is merely illustrative, and this application is not limited thereto. The computing device 1500 in this application includes various hardware components in existing computer systems. For example, the computing device 1500 also includes other memories besides memory 1520, such as disk storage. Those skilled in the art should understand that the computing device 1500 may also include other devices necessary for normal operation. Furthermore, depending on specific needs, those skilled in the art should understand that the computing device 1500 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the computing device 1500 may only include the devices necessary for implementing the embodiments of this application, and not necessarily all the devices shown in Figure 6.
[0208] This application also provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server. In some embodiments, the computing device may also be a desktop computer, a laptop computer, or a smartphone, or other terminal device.
[0209] As shown in Figure 7, the computing device cluster includes at least one computing device 1500. The memory 1520 of one or more computing devices 1500 in the computing device cluster may store the same instructions for performing the above-described methods.
[0210] In some possible implementations, the memory 1520 of one or more computing devices 1500 in the computing device cluster may also each store a portion of the instructions for executing the above-described methods. In other words, a combination of one or more computing devices 1500 can jointly execute the instructions of the above-described methods.
[0211] It should be noted that the memory 1520 in different computing devices 1500 within the computing device cluster can store different instructions, each used to execute a portion of the functions of the aforementioned device. That is, the instructions stored in the memory 1520 of different computing devices 1500 can implement the functions of one or more modules within the aforementioned device.
[0212] In some possible implementations, one or more computing devices in a computing device cluster can be connected via a network. This network can be a wide area network (WAN) or a local area network (LAN), etc. Figure 8 illustrates one possible implementation. As shown in Figure 8, two computing devices, 1500A and 1500B, are connected via a network. Specifically, they are connected to the network through communication interfaces in each computing device.
[0213] It should be understood that the functions of computing device 1500A shown in Figure 8 can also be performed by multiple computing devices 1500. Similarly, the functions of computing device 1500B can also be performed by multiple computing devices 1500.
[0214] In this embodiment, a computer program product containing instructions is also provided. The computer program product may be a software or program product containing instructions capable of running on a computing device or stored on any usable medium. When run on a computing device, it causes the computing device to perform the methods provided above, or causes the computing device to perform the functions of the apparatus provided above.
[0215] In this embodiment, a computer-readable storage medium is also provided. This computer-readable storage medium can be any usable medium on which a computing device can store data, or a data storage device, such as a data center, containing one or more usable media. The usable medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive). The computer-readable storage medium includes instructions that, when executed on a computing device, cause the computing device to perform the method described above.
[0216] It should be understood that in the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
[0217] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0218] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0219] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0220] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0221] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0222] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0223] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for determining a solver, characterized in that the method includes: obtaining context information of a prompt of a large language model, wherein the context information includes the evaluation metrics of each of at least one historical algorithm; based on the prompt, using the large language model to output at least one new algorithm; determining the evaluation metrics corresponding to each of the at least one new algorithm; and based on the evaluation metrics corresponding to each of the at least one new algorithm and the evaluation metrics corresponding to each of the at least one historical algorithm, selecting an algorithm set from the at least one historical algorithm and the at least one new algorithm, wherein the algorithm set includes at least one target algorithm, which is used by the solver to solve a received problem to be solved in order to obtain a solution to the problem to be solved.
2. The method according to claim 1, characterized in that, The context includes the at least one historical algorithm and its respective evaluation metric.
3. The method according to claim 1 or 2, characterized in that determining the evaluation metrics corresponding to each of the at least one new algorithm includes: receiving at least one first historical problem; and solving the at least one first historical problem based on the at least one new algorithm to obtain at least one first solution corresponding to the at least one first historical problem, and determining the evaluation metrics of each of the at least one new algorithm.
4. The method according to any one of claims 1 to 3, characterized in that, before obtaining the context information of the prompt of the large language model, the method further includes: receiving at least one second historical problem; and solving the at least one second historical problem based on the at least one historical algorithm to obtain at least one second solution corresponding to each of the at least one second historical problem, and determining the evaluation metrics of each of the at least one historical algorithm.
5. The method according to any one of claims 1 to 4, characterized in that, The step of selecting an algorithm set from the at least one historical algorithm and the at least one new algorithm based on the evaluation metrics corresponding to each of the at least one new algorithm and the evaluation metrics corresponding to each of the at least one historical algorithm includes: Based on the evaluation metrics corresponding to the at least one new algorithm and the at least one historical algorithm, the algorithm set is selected from the at least one historical algorithm and the at least one new algorithm based on the mechanism of population evolution.
6. The method according to any one of claims 1 to 5, characterized in that, The context also includes template information of the at least one historical algorithm, the template information including at least one of the following: input information, output information and decision content of the historical algorithm.
7. The method of claim 6, wherein, The prompt is used to instruct the large language model to use the evaluation metrics of each of the at least one historical algorithm as supervision information to generate the at least one new algorithm; The step of inputting the context information into a large language model to generate at least one new algorithm includes: The context information and the prompts are input into the large language model, and the large language model outputs the at least one new algorithm.
8. The method according to any one of claims 1 to 7, characterized in that, The at least one historical algorithm and the at least one new algorithm have the same function.
9. The method according to any one of claims 1 to 8, characterized in that, The method further includes: The system receives configuration data input by the user, which includes configuration parameters of the large language model.
10. An apparatus for determining a solver, characterized in that the apparatus includes: an acquisition module, used to acquire context information of a prompt of a large language model, wherein the context information includes the evaluation metrics of each of at least one historical algorithm; a generation module, used to output at least one new algorithm based on the prompt and the large language model; a determination module, used to determine the evaluation metrics corresponding to each of the at least one new algorithm; and a filtering module, used to select an algorithm set from the at least one historical algorithm and the at least one new algorithm according to the evaluation metrics corresponding to each of the at least one new algorithm and the evaluation metrics corresponding to each of the at least one historical algorithm, wherein the algorithm set includes at least one target algorithm, which is used by the solver to solve a received problem to be solved in order to obtain a solution to the problem to be solved.
11. The apparatus of claim 10, wherein, The context includes the at least one historical algorithm and its respective evaluation metric.
12. The apparatus of claim 10 or 11, wherein the determination module is specifically used for: receiving at least one first historical problem; and solving the at least one first historical problem based on the at least one new algorithm to obtain at least one first solution corresponding to the at least one first historical problem, and determining the evaluation metrics of each of the at least one new algorithm.
13. The apparatus according to any one of claims 10 to 12, characterized in that the acquisition module is also used to receive at least one second historical problem; and the acquisition module is further configured to solve the at least one second historical problem based on the at least one historical algorithm, obtain at least one second solution corresponding to each of the at least one second historical problem, and determine the evaluation metrics of each of the at least one historical algorithm.
14. The apparatus according to any one of claims 10 to 13, characterized in that, The filtering module is specifically used for: Based on the evaluation metrics corresponding to the at least one new algorithm and the at least one historical algorithm, the algorithm set is selected from the at least one historical algorithm and the at least one new algorithm based on the mechanism of population evolution.
15. The apparatus according to any one of claims 10 to 14, characterized in that, The context also includes template information of the at least one historical algorithm, the template information including at least one of the following: input information, output information and decision content of the historical algorithm.
16. The apparatus of claim 15, wherein, The prompt is used to instruct the large language model to use the evaluation metrics of each of the at least one historical algorithm as supervision information to generate the at least one new algorithm; The generation module is specifically used for: The context information and the prompts are input into the large language model, and the large language model outputs the at least one new algorithm.
17. The apparatus according to any one of claims 10 to 16, characterized in that, The at least one historical algorithm and the at least one new algorithm have the same function.
18. The apparatus according to any one of claims 10 to 17, characterized in that, The acquisition module is further configured to receive configuration data input by the user, the configuration data including configuration parameters of the large language model.
19. A cluster of computing devices, characterized in that, It includes at least one computing device, each computing device including a processor and memory; The processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform the method as described in any one of claims 1 to 9.
20. A computer program product containing instructions, characterized in that, when the instructions are executed by a computing device cluster, the computing device cluster performs the method as described in any one of claims 1 to 9.
21. A computer-readable storage medium, characterized in that, It includes computer program instructions, which, when executed by a cluster of computing devices, perform the method as described in any one of claims 1 to 9.