Multi-agent collaborative hotel recommendation method, hotel recommendation device, electronic device and storage medium

By replacing the traditional multi-level independent processing flow with a multi-agent collaborative architecture, the recommendation system is simplified and globally optimized, improving the accuracy and consistency of personalized recommendations and reducing system complexity and maintenance costs.

CN122309837A · Pending · Publication Date: 2026-06-30 · BAIDU ONLINE NETWORK TECH (BEIJIBG) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BAIDU ONLINE NETWORK TECH (BEIJIBG) CO LTD
Filing Date
2026-03-17
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing recommendation systems are complex in structure, have lengthy processing links, lack an overall collaborative optimization mechanism, make it difficult to achieve personalized recommendations, and rely on manual rules, resulting in high maintenance costs and suboptimal recommendation effects.

Method used

A multi-agent collaborative architecture is adopted, in which the first agent recalls, the second agent evaluates and provides feedback, and the third agent ranks, realizing an end-to-end recommendation process. All agents work together to optimize around the same optimization goal.

Benefits of technology

Simplify system architecture, reduce maintenance costs, improve recommendation consistency and personalization capabilities, and generate accurate recommendation results that meet users' personalized needs.



Abstract

This disclosure relates to search recommendation, multi-agent collaboration, and mapping, particularly to a hotel recommendation method, device, electronic device, and storage medium based on multi-agent collaboration. The specific implementation involves: a first agent recalling candidate hotels based on acquired user profile information, environmental information, and user search intent; a second agent evaluating the candidate hotel set, and if the set does not meet user needs, sending feedback to the first agent, triggering the first agent to regenerate the candidate hotel set until it meets user needs; and a third agent ranking the candidate hotel set to generate a recommendation result. This disclosure continuously optimizes the recommendation result through the interaction and feedback mechanism between agents, automatically generating recommendation results that meet user personalized preferences and scenario requirements without manual rule intervention.

Description

Technical Field

[0001] This disclosure relates to search recommendation, multi-agent collaboration, and mapping, and in particular to a multi-agent collaborative hotel recommendation method, hotel recommendation device, electronic device, and storage medium.

Background Technology

[0002] Existing recommendation systems typically employ a phased, multi-level processing architecture. When a user initiates a query, the system sequentially goes through multiple processing stages, including content recall, coarse ranking, fine ranking, and re-ranking, filtering and sorting candidate results layer by layer to obtain the final recommendation. This "recall-coarse ranking-fine ranking-re-ranking" approach is essentially a funnel structure, with each stage relatively independent and usually implemented by different models or strategies, resulting in a lack of an overall collaborative optimization mechanism. In practical applications, the output of this process often requires adjustments based on numerous manual rules to meet specific business needs, further increasing system complexity and maintenance costs.

[0003] Specifically, traditional recommendation systems suffer from the following shortcomings: First, the superposition of models at various levels and manual rules makes the system structure complex and the processing chain lengthy. There is a lack of overall coordination between stages, making it difficult to optimize the overall recommendation effect globally, and the system maintenance and iteration costs are high. Second, the system is not an end-to-end recommendation model. The recommendation results are highly dependent on manual design and rule intervention, making it difficult to fully model users' personalized preferences and complex scenario information, thus limiting further improvement in recommendation effect. Third, in the multi-level architecture, different stages often have different optimization objectives. For example, the recall stage focuses on recall rate, while the ranking stage focuses on metrics such as click-through rate and conversion rate. Inconsistent objectives can easily lead to optimization conflicts, making it difficult for local optima to be transmitted to global optima. In practical applications, only suboptimal recommendation results are often obtained.

[0004] Therefore, there is an urgent need for a hotel recommendation method that can simplify the architecture of recommendation systems, reduce reliance on manual rules, improve the global optimization capabilities of recommendation systems, and achieve personalized recommendations.

Summary of the Invention

[0005] This disclosure provides a hotel recommendation method, a hotel recommendation device, an electronic device, and a storage medium for multi-agent collaboration.

[0006] According to one aspect of this disclosure, a multi-agent collaborative hotel recommendation method is provided, comprising: in response to a user's hotel search request, determining the user's search intent based on the hotel search request; recalling, by a first intelligent agent, candidate hotels based on acquired user profile information, environmental information, and the user's search intent; evaluating, by a second intelligent agent, the candidate hotel set based on the user profile information and the environmental information to determine whether the candidate hotel set meets the user's needs; in response to the candidate hotel set not meeting the user's needs, sending, by the second intelligent agent, feedback information to the first intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set according to the feedback information, until the candidate hotel set meets the user's needs; and in response to the candidate hotel set meeting the user's needs, sorting, by a third intelligent agent, the candidate hotel set according to the user profile information and hotel details, and generating a recommendation result.

[0007] According to another aspect of this disclosure, a multi-agent collaborative hotel recommendation device is provided, comprising: an intent determination module configured to, in response to a user's hotel search request, determine the user's search intent based on the hotel search request; a recall module configured to recall, through a first intelligent agent, candidate hotels based on acquired user profile information, environmental information, and the user's search intent; an evaluation module configured to evaluate, through a second intelligent agent, the candidate hotel set based on the user profile information and the environmental information, so as to determine whether the candidate hotel set meets the user's needs; an interaction module configured to, in response to the candidate hotel set not meeting the user's needs, send feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set according to the feedback information, until the candidate hotel set meets the user's needs; and a sorting module configured to, in response to the candidate hotel set meeting the user's needs, sort the candidate hotel set according to the user profile information and hotel details through a third intelligent agent, and generate a recommendation result.

[0008] According to a third aspect of this disclosure, an electronic device is provided, comprising: At least one processor; and A memory communicatively connected to the at least one processor; wherein, The memory stores instructions that can be executed by the at least one processor, which, when executed by the at least one processor, enables the at least one processor to perform the method described in any of the above technical solutions.

[0009] According to a fourth aspect of this disclosure, a non-transitory computer-readable storage medium is provided storing computer instructions, wherein the computer instructions are used to cause the computer to perform any one of the methods described above.

[0010] According to a fifth aspect of this disclosure, a computer program product is provided, comprising a computer program that, when executed by a processor, implements the method described in any one of the above technical solutions.

[0011] This disclosure replaces the traditional multi-level independent processing flow by constructing a multi-agent collaborative architecture. The agents continuously optimize recommendation results through interaction and feedback mechanisms, enabling the recommendation system to automatically generate recommendations that match user preferences and scenario requirements without manual rule intervention. This not only effectively simplifies the overall architecture of the recommendation system and reduces system maintenance and iteration costs, but also improves the overall consistency of the recommendation process and the personalization of recommendation results, thereby enhancing the user experience.

[0012] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Attached Figure Description

[0013] The accompanying drawings are provided for a better understanding of this solution and do not constitute a limitation of this disclosure. In the drawings:
Figure 1 is a flowchart of the steps of the multi-agent collaborative hotel recommendation method according to an embodiment of this disclosure;
Figure 2 is an example diagram of the user interface of the multi-agent collaborative hotel recommendation method according to an embodiment of this disclosure;
Figure 3 is an architecture diagram of the recommendation system used by the multi-agent collaborative hotel recommendation method according to an embodiment of this disclosure;
Figure 4 is a schematic diagram of the structure of a multi-agent collaborative hotel recommendation device according to an embodiment of this disclosure;
Figure 5 is a block diagram of an electronic device used to implement the multi-agent collaborative hotel recommendation method according to an embodiment of this disclosure.

Detailed Implementation

[0014] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding, and should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.

[0015] Existing search and recommendation systems typically employ a phased, multi-level processing architecture to respond to user queries. Specifically, when a user initiates a query, the system first parses the query information and, based on the parsing results, performs a content recall operation from the full pool of candidate content to obtain a set of candidate content associated with the query. Since the number of candidate content obtained in the recall phase is usually large and difficult to directly display as recommendation results, existing technologies typically include coarse-ranking and fine-ranking stages. The coarse-ranking stage generally uses models or rules with low computational complexity to initially filter and rank the candidate content set output from the recall phase, reducing the size of the candidate content set. Subsequently, the system inputs the coarse-ranked candidate content set into the fine-ranking stage. The fine-ranking stage typically uses a more complex and feature-rich ranking model to refine the ranking of the candidate content set, obtaining ranking results that align with user interests and business objectives. In some application scenarios, to improve the diversity of recommendation results or meet specific display strategy requirements, a re-ranking stage is set after the fine-ranking stage to further adjust the ranking results. The content results processed after the re-ranking stage constitute the main output of the recommendation chain. Furthermore, in actual online applications, influenced by changes in business strategies, operational needs, or scenarios at different times, existing recommendation systems often need to introduce manually defined rules to further adjust the recommendation results based on the output of the above model chain, ultimately forming the set of recommended content displayed to the user. 
Therefore, existing search recommendation systems typically form a multi-level processing flow of "recall—coarse ranking—fine ranking—re-ranking—manual rule intervention".

[0016] While the above recommendation schemes have been widely adopted in practice, they still have the following shortcomings: First, existing recommendation systems typically consist of multiple independent models and a large number of manually created rules, resulting in complex system structures, lengthy processing links, and a lack of unified collaborative optimization mechanisms between stages. This makes it difficult to optimize the system globally, and the system maintenance, debugging, and iteration costs are high. Second, the above recommendation methods are not end-to-end recommendation models. The recommendation results largely depend on manually designed feature engineering and rule intervention, making it difficult to uniformly model users' personalized preference information, contextual information, and their complex relationships, thus limiting further improvement in recommendation performance. Third, in multi-level processing architectures, different stages typically optimize for different local objectives. For example, the recall stage focuses on recall rate, while the ranking stage focuses on click-through rate, conversion rate, and other metrics. The inconsistent optimization objectives of each stage can easily lead to optimization conflicts, making it difficult for locally optimal results to be transmitted to the overall optimal results. In practical applications, only suboptimal recommendation performance is often achieved.

[0017] To address the aforementioned technical problems in existing recommendation methods, this disclosure provides a multi-agent collaborative hotel recommendation method which, as shown in Figure 1, includes: Step S101: In response to the user's hotel search request, determine the user's search intent based on the hotel search request.

[0018] Figure 2 shows an example of the user interface of the hotel recommendation system in this embodiment. Users can enter query information in the search bar 201 to generate a hotel search request. For example, the query information can be text that indicates the user's search intent, such as "hotel", "family hotel", "business hotel" or "Bund hotel". This information is used to identify the type of search the user wants to perform (such as regional search or central search).

[0019] Step S102: The first intelligent agent recalls candidate hotels based on the acquired user profile information, environmental information, and user search intent.

[0020] In this embodiment, user profile information is used to describe the user's long-term preference characteristics in hotel selection. Environmental information is used to describe the overall distribution of hotels in the user's current area. This environmental information includes, but is not limited to, the distribution of the number of hotels of different brands and star ratings around the user. By introducing the above environmental information, it is equivalent to providing the first intelligent agent (Planner Agent) with an environmental map reflecting the current hotel distribution status, enabling it to not only consider user preferences but also make comprehensive judgments based on the objective environment when recalling candidate hotels, thus possessing a certain degree of environmental awareness. This embodiment uses the first intelligent agent to perform multi-dimensional fusion recall by integrating user profile information, environmental information, and the current user search intent, enabling the recall stage to accurately capture the user's potential preferences and current scenario needs, improve the matching degree between the candidate hotel set and the user's true intent, and provide a high-quality data foundation for the subsequent evaluation and ranking stages.

[0021] Step S103: The second intelligent agent evaluates the candidate hotel set based on user profile information and environmental information to determine whether the candidate hotel set meets the user's needs.

[0022] Specifically, the candidate hotel set output by the first intelligent agent will be sent to the second intelligent agent (EvaluatorAgent). The second intelligent agent will combine user profile information and environmental information to evaluate whether the candidate results are reasonable, and execute different interaction processes based on the evaluation results.

[0023] In step S104, in response to the fact that the candidate hotel set does not meet the user's needs, the second intelligent agent sends feedback information to the first intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information until the candidate hotel set meets the user's needs.

[0024] If the evaluation results of the second intelligent agent do not meet the user's needs—for example, if the second intelligent agent finds that the candidate hotel set includes hotels with prices exceeding the user's historical affordability—it will send a feedback message to the first intelligent agent stating, "The user is price-sensitive; it is recommended to filter hotels under 300 yuan." This triggers the first intelligent agent to re-recall hotels within that price range until the generated candidate set fully or largely meets the user's budget preferences. Through dynamic interaction and feedback calibration among multiple intelligent agents, the candidate hotel set can be continuously iterated and optimized without manual rule intervention until it accurately matches the user's personalized needs.

[0025] Step S105: In response to the fact that the candidate hotel set meets the user's needs, the third intelligent agent sorts the candidate hotel set according to the user profile information and hotel details, and generates recommendation results.

[0026] If the second intelligent agent's evaluation finds that the candidate hotel set meets the user's needs, the candidate hotel set is passed directly to the third intelligent agent (Ranker Agent) for ranking. For example, the candidate hotel set includes 100 candidate hotels, among them hotels that match the user's budget such as "Hotel A Guomao Branch (280 yuan / night)" and "Hotel B Guomao Branch (320 yuan / night)". Based on the user profile information (preference for business hotels, proximity to the subway) and hotel details (Hotel A is 200 meters from the subway station, Hotel B is 600 meters from the subway station), the third intelligent agent selects the 10 candidate hotels that best match the current query as the final recommended hotels. Furthermore, based on the degree of match with the user's preferences, "Hotel A Guomao Branch" is ranked first.
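The ranking step in this example can be illustrated with a minimal Python sketch. The scoring rule (one point for a matching hotel type, minus the subway distance in kilometers), the field names, and the assumption that both hotels are business hotels are hypothetical; the disclosure leaves ranking to an LLM-based Ranker Agent and does not specify a scoring formula.

```python
from typing import Any, Dict, List

Hotel = Dict[str, Any]


def score(hotel: Hotel, profile: Dict[str, Any]) -> float:
    """Hypothetical match score: reward a type match, penalize subway distance."""
    type_match = 1.0 if hotel["type"] == profile["preferred_type"] else 0.0
    subway_penalty = hotel["subway_distance_m"] / 1000.0
    return type_match - subway_penalty


def rank(candidates: List[Hotel], profile: Dict[str, Any], top_k: int = 10) -> List[Hotel]:
    """Return at most top_k candidates, best match first."""
    ordered = sorted(candidates, key=lambda h: score(h, profile), reverse=True)
    return ordered[:top_k]


# Values from the example above; "type" is an assumed attribute.
profile = {"preferred_type": "business"}
candidates = [
    {"name": "Hotel B Guomao Branch", "price": 320, "type": "business", "subway_distance_m": 600},
    {"name": "Hotel A Guomao Branch", "price": 280, "type": "business", "subway_distance_m": 200},
]
recommended = rank(candidates, profile)
print([h["name"] for h in recommended])
```

Under these assumed weights, the hotel closer to the subway is placed first, which matches the ordering described in the example.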

[0027] In addition, Figure 3 illustrates an example architecture of the recommendation system of this disclosure, which includes a first agent 301, a second agent 302, a third agent 303, a fourth agent 304, a query parser 305, and a recall tool 306. Each agent in this disclosure can employ any open-source Large Language Model (LLM), with a preference for lightweight models such as the ERNIE series (ERNIE-4.5-0.3B) and the Qwen series (Qwen3.5-0.8B). This disclosure employs multi-agent joint training, so that all agents are optimized collaboratively around the same global objective and the system is trained end to end; that is, the entire link from user input to the final recommendation result is learned and improved jointly. By contrast, in traditional recommender system architectures each stage is trained independently and the optimization objectives are fragmented (e.g., recall focuses on recall rate, ranking focuses on click-through rate), so it is difficult to achieve global optimization across the entire link, and the recommendation results are often locally optimal rather than globally optimal. Joint training allows all modules to share the same optimization signal, ensuring that each step in the decision-making chain is optimized in the same direction, thereby improving the overall recommendation effect.

[0028] This disclosure constructs a multi-agent collaborative architecture in the recommendation system through the above technical solution, achieving the following technical effects: First, it simplifies the recommendation system architecture and reduces maintenance costs. This disclosure replaces the traditional multi-level independent processing flow of "recall—coarse ranking—fine ranking—re-ranking—manual rules" with multiple clearly defined agents, integrating the scattered independent modules into a unified collaborative framework. The system no longer needs to maintain multiple models and the manual rules required for each model separately, significantly reducing the complexity and iteration cost of the recommendation system. Second, it achieves global optimization and improves recommendation consistency. Different agents form a decision-making closed loop through interactive feedback mechanisms. All processing links work collaboratively around the same optimization goal (i.e., the final recommendation result), avoiding optimization deviations caused by conflicting goals at different stages in the traditional architecture, and enabling local optima to be effectively transmitted to global optima. Third, it enhances personalization capabilities and improves user experience. The system can uniformly model user queries, user profile information, and environmental information, and dynamically calibrate recommendation results in multiple rounds of interaction between agents. It can automatically generate the final ranking that meets the user's personalized needs without the intervention of manual rules, and the accuracy and interpretability of the recommendation results are improved simultaneously.

[0029] As an optional implementation, step S101, determining the user's search intent based on the hotel search request, includes: Hotel search requests are parsed to identify the search type as the user's search intent; the search type includes regional search and / or central search.

[0030] Figure 3 illustrates an example architecture of the recommendation system of this disclosure. When a user initiates a hotel search request, the request contains query information. First, the query is parsed by the query parser 305 to identify the query type (such as regional search or central search), i.e., to determine what kind of hotel search the user intends. If the user searches for "hotels near Guomao," the query parser will identify this as a "regional search" (the user is concerned with geographical location); if the user searches for "hotels on Wangfujing Pedestrian Street," the parser will identify this as a "central search" (the user is concerned with a specific landmark or central point); if the user only searches for "hotels in Beijing," it may be identified as a "city-level search." Once the search intent is determined, the Planner Agent knows the user's core intent and can more accurately decide which tools to invoke and which factors to prioritize, thereby executing a reasonable recall strategy.
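As a rough illustration of the parser's output, the three example queries can be classified with simple keyword cues. This is only a sketch under stated assumptions: the disclosure does not describe how the query parser 305 works internally, and the function name `parse_search_type`, the English preposition cues, and the "unknown" fallback are all hypothetical.

```python
import re


def parse_search_type(query: str) -> str:
    """Classify a hotel query by hypothetical keyword cues (illustration only)."""
    q = query.lower()
    if re.search(r"\b(near|nearby|around)\b", q):
        return "regional"      # user cares about a surrounding area
    if re.search(r"\b(on|at)\b", q):
        return "central"       # user names a specific landmark or central point
    if re.search(r"\bin\b", q):
        return "city-level"    # user names only a city
    return "unknown"


for q in ("hotels near Guomao",
          "hotels on Wangfujing Pedestrian Street",
          "hotels in Beijing"):
    print(q, "->", parse_search_type(q))
```

A production parser would more plausibly rely on geographic entity recognition than on prepositions; the sketch only shows the shape of the search-type label handed to the Planner Agent.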

[0031] As an optional implementation, before step S102, which involves the first intelligent agent recalling candidate hotels based on the acquired user profile information, environmental information, and user search intent, the following steps are also included: The user's historical behavior data is input into a fourth intelligent agent to extract one or more user preference features as user profile information. This user profile information includes at least one of the following user preference features: geographic location preference, hotel attribute preference, and price preference.

[0032] In this embodiment, the fourth agent utilizes a large model with strong reasoning capabilities to analyze users' historical behavior data, generating high-quality preference inference samples. Based on these samples, a relatively lightweight model is fine-tuned to enable user preference modeling while meeting online latency requirements. For example, users' historical behavior can include various actions such as clicking, browsing, placing orders, and adding items to favorites, reflecting their preferences when choosing hotels. In this embodiment, the fourth agent (Memory Agent) performs in-depth mining and feature extraction on users' historical behavior data, automatically generating multi-dimensional user profile information including geographical location preferences, hotel attribute preferences, and price preferences. This embodiment uses the fourth agent to transform implicit user historical behavior into explicit user preference features, providing stable, reusable, high-quality user profile information for other agents, thereby reducing the system's reliance on manual rules and explicit feature engineering.
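To make the idea of turning implicit behavior into explicit preference features concrete, the following Python sketch aggregates behavior events into the three feature types named above. It deliberately swaps the LLM-based Memory Agent for simple weighted statistics; the action weights, the field names, and the use of a weighted median for price are assumptions for illustration, not part of the disclosure.

```python
from collections import Counter
from statistics import median
from typing import Any, Dict, List

# Assumed weights: stronger actions count more toward a preference.
ACTION_WEIGHTS = {"browse": 1, "click": 1, "favorite": 2, "order": 3}


def build_profile(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive explicit preference features from historical behavior events."""
    if not events:
        raise ValueError("at least one behavior event is required")
    areas: Counter = Counter()
    types: Counter = Counter()
    prices: List[int] = []
    for e in events:
        w = ACTION_WEIGHTS.get(e["action"], 0)
        areas[e["area"]] += w
        types[e["hotel_type"]] += w
        prices.extend([e["price"]] * w)  # repeat the price w times (weighted median)
    return {
        "location_preference": areas.most_common(1)[0][0],
        "attribute_preference": types.most_common(1)[0][0],
        "price_preference": median(prices),
    }


events = [
    {"action": "click", "area": "Guomao", "hotel_type": "business", "price": 320},
    {"action": "order", "area": "Guomao", "hotel_type": "business", "price": 280},
    {"action": "browse", "area": "Sanya", "hotel_type": "resort", "price": 900},
]
profile = build_profile(events)
print(profile)
```

Because the order is weighted three times as heavily as a click or a browse, the single high-priced resort view does not shift the inferred price preference.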

[0033] As an optional implementation, step S102, in which the first intelligent agent recalls candidate hotels based on the acquired user profile information, environmental information, and user search intent, includes: The first intelligent agent determines the corresponding recall strategy based on the user profile information, environmental information, and user search intent.

[0034] Multiple candidate hotels are recalled according to the recall strategy.

[0035] A preset number of candidate hotels is selected from the multiple candidate hotels to generate the candidate hotel set.

[0036] For example, user profile information includes a preference for budget chain hotels, price sensitivity, and a preference for hotels in the 300-500 yuan range; environmental information: there are 1000 hotels in the Guomao area, of which 680 are budget hotels and 530 are in the 300-500 yuan range; user search intent: regional search (near Guomao) + price considerations. The recall strategy decided by the first intelligent agent is to "prioritize recalling hotels within 3 kilometers of Guomao, priced between 260-560 yuan, and belonging to budget chain brands." In this embodiment, the first intelligent agent dynamically determines the recall strategy based on multi-dimensional information, executes the recall, and filters a preset number of candidate hotels, enabling the recall phase to accurately match users' personalized needs and control the number of candidate hotels.
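The three sub-steps (determine a strategy, recall, keep a preset number) can be sketched as follows. The strategy values are taken from the example above; the `RecallStrategy` fields, the toy hotel pool, and the rule of keeping the nearest hotels when truncating are assumptions, since the disclosure does not state how the preset number of candidates is chosen.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Hotel = Dict[str, Any]


@dataclass(frozen=True)
class RecallStrategy:
    max_distance_km: float
    min_price: int
    max_price: int
    category: str


def recall(pool: List[Hotel], s: RecallStrategy) -> List[Hotel]:
    """Recall every hotel in the pool that satisfies the strategy."""
    return [
        h for h in pool
        if h["distance_km"] <= s.max_distance_km
        and s.min_price <= h["price"] <= s.max_price
        and h["category"] == s.category
    ]


def select_candidates(recalled: List[Hotel], preset_count: int) -> List[Hotel]:
    """Keep a preset number of hotels (assumption: nearest first)."""
    return sorted(recalled, key=lambda h: h["distance_km"])[:preset_count]


# Strategy from the example: within 3 km of Guomao, 260-560 yuan, budget chain.
strategy = RecallStrategy(max_distance_km=3.0, min_price=260, max_price=560,
                          category="budget_chain")
pool = [  # hypothetical hotels
    {"name": "H1", "distance_km": 0.5, "price": 300, "category": "budget_chain"},
    {"name": "H2", "distance_km": 2.8, "price": 520, "category": "budget_chain"},
    {"name": "H3", "distance_km": 1.2, "price": 700, "category": "budget_chain"},
    {"name": "H4", "distance_km": 4.0, "price": 320, "category": "budget_chain"},
    {"name": "H5", "distance_km": 1.0, "price": 400, "category": "luxury"},
    {"name": "H6", "distance_km": 1.5, "price": 280, "category": "budget_chain"},
]
candidate_set = select_candidates(recall(pool, strategy), preset_count=2)
print([h["name"] for h in candidate_set])
```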

[0037] Furthermore, as an optional implementation, the first intelligent agent determines the corresponding recall strategy based on user profile information, environmental information, and user search intent, including: The recall tools are determined based on user profile information, environmental information, and user search intent; each recall tool includes at least one of the following recall conditions: hotel price, hotel location, hotel star rating, and hotel type.

[0038] The order in which the recall tools are invoked is determined based on their priority.

[0039] Based on the determined recall tool and the invocation order, a corresponding recall strategy is generated.

[0040] Specifically, as shown in Figure 3, the recommendation system pre-sets multiple recall tools, each of which recalls candidate hotels using a different recall criterion, such as price, location, star rating, or hotel type. As an example, after comprehensive analysis, the first agent decides to invoke the following recall tools: the price tool, with a recall criterion of 200-600 yuan, which aligns with the user's price-sensitive profile; the location tool, with a recall criterion of within 3 kilometers of the Guomao area, which aligns with the user's search intent; the star rating tool, with a recall criterion of economy / comfort, which aligns with the user's preference for budget chain hotels; and the type tool, with a recall criterion of business hotels, which aligns with the user's preference.

[0041] The first agent decides which recall tools to use to search for hotels, then determines the priority of each tool, such as location being the highest priority, followed by price, star rating, and type. Finally, these are combined to form a complete recall strategy. As an example, the first agent determines the calling order based on the priority of the recall tools: the user's search intent of "near Guomao" clearly points to location, so the location tool has the highest priority; next is the price tool, because the user profile shows price sensitivity; finally, star rating and type are used as supplementary filters. The final calling order of the recall tools is: location tool → price tool → star rating tool → type tool. In this embodiment, the recall strategy is generated based on the determined recall tools and calling order, enabling the first agent to flexibly combine multi-dimensional recall conditions according to user profiles and search intent, improving the accuracy of the recall results and the interpretability of the strategy.
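The tool selection and ordering described above can be sketched in Python as follows. Each recall tool is modeled as a plain filter function and priorities as integers (lower runs first); both modeling choices, along with all names, are assumptions for illustration rather than the disclosed implementation.

```python
from typing import Any, Callable, Dict, List

Hotel = Dict[str, Any]
RecallTool = Callable[[List[Hotel]], List[Hotel]]

# Recall tools with the criteria from the example above.
TOOLS: Dict[str, RecallTool] = {
    "type": lambda hs: [h for h in hs if h["type"] == "business"],
    "price": lambda hs: [h for h in hs if 200 <= h["price"] <= 600],
    "star": lambda hs: [h for h in hs if h["star"] in ("economy", "comfort")],
    "location": lambda hs: [h for h in hs if h["distance_km"] <= 3.0],
}

# Assumed numeric priorities: location first, then price, star rating, type.
PRIORITY = {"location": 1, "price": 2, "star": 3, "type": 4}


def build_strategy(selected: List[str], priority: Dict[str, int]) -> List[str]:
    """Order the selected tools by priority to form the invocation order."""
    return sorted(selected, key=lambda name: priority[name])


def execute(strategy: List[str], pool: List[Hotel]) -> List[Hotel]:
    """Invoke each tool in order, narrowing the pool step by step."""
    for name in strategy:
        pool = TOOLS[name](pool)
    return pool


strategy = build_strategy(list(TOOLS), PRIORITY)
print(" -> ".join(strategy))  # location -> price -> star -> type
```

In this simplified form every tool is a pure filter, so the invocation order changes only how early the pool shrinks, not the final set; the order would matter more if tools queried separate indexes or returned ranked, truncated lists.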

[0042] As an optional implementation, step S104, in response to the candidate hotel set not meeting the user's needs, sends feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information, until the candidate hotel set meets the user's needs, further includes: A preset number of interaction rounds is set for the first and second intelligent agents. When the interaction between the first and second intelligent agents reaches the preset number of interaction rounds, the second intelligent agent stops sending feedback information to the first intelligent agent.

[0043] As an example, in the interaction between the first and second agents, the system pre-sets a maximum of 3 interaction rounds. The first agent initially recalls a set of candidate hotels, which the second agent evaluates as "too expensive and doesn't meet user preferences," and sends feedback. The first agent adjusts its recall strategy accordingly, generating a second set of candidate hotels, but the second agent still reports that "some hotels are located in remote areas." After the third round of adjustments, if the candidate set still doesn't fully meet the user's needs, the second agent stops sending feedback because the pre-set 3-round interaction limit has been reached, and the system proceeds to the subsequent sorting process. By pre-setting a maximum number of interaction rounds, the system avoids the first and second agents getting stuck in an infinite loop or engaging in ineffective interactions, ensuring that the system optimizes the candidate hotel set within a limited number of rounds, balancing recommendation effectiveness and system response efficiency, and improving overall processing performance.
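The bounded interaction can be expressed as a small control loop. In this sketch, `recall` and `evaluate` stand in for the first and second agents and are hypothetical callables; one "round" is counted each time feedback is sent, which is one reading of the example above.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Hotel = Dict[str, Any]
Recall = Callable[[Optional[str]], List[Hotel]]    # feedback (or None) -> candidate set
Evaluate = Callable[[List[Hotel]], Optional[str]]  # candidate set -> feedback, None if acceptable


def recall_with_feedback(recall: Recall, evaluate: Evaluate,
                         max_rounds: int = 3) -> Tuple[List[Hotel], int]:
    """Iterate recall and evaluation until accepted or max_rounds feedbacks are sent."""
    candidates = recall(None)          # initial recall, no feedback yet
    rounds_used = 0
    while rounds_used < max_rounds:
        feedback = evaluate(candidates)
        if feedback is None:           # candidate set meets the user's needs
            break
        rounds_used += 1               # one interaction round consumed
        candidates = recall(feedback)  # regenerate using the feedback
    # After the limit is reached, the latest set goes on to ranking as-is.
    return candidates, rounds_used
```

With `max_rounds=3` and an evaluator that is never satisfied, the recall function runs four times (one initial recall plus three adjustments) and the loop then hands the latest candidate set to the ranking step.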

[0044] As an optional implementation, in step S104, sending feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information, includes: The second intelligent agent generates feedback information based on user profile information and environmental information, the feedback information comprising the problems in the candidate hotel set that do not meet the user's needs and the corresponding adjustment suggestions, and sends the feedback information to the first intelligent agent.

[0045] The recall strategy of the first intelligent agent is adjusted based on the feedback information.

[0046] The first intelligent agent generates a set of candidate hotels based on the adjusted recall strategy.

[0047] For example, in the first round of recall, the first agent generates a candidate hotel set including "Hotel A" and "Hotel B". The second agent, based on user profile information (price-sensitive, preference for budget hotels) and environmental information (sufficient budget hotels in the Guomao area), evaluates and finds that "Hotel A" is priced at 680 yuan, exceeding the user's budget range. Therefore, the second agent generates feedback: "There is a problem with the price being too high in the candidate set (Hotel A). It is recommended to adjust the price ceiling to 500 yuan and filter budget hotels." Upon receiving the feedback, the first agent adjusts the price condition in the recall strategy from 200-800 yuan to 200-500 yuan, and re-recalls a new candidate hotel set that meets the conditions, including "Hotel B", "Hotel C", "Hotel D", and "Hotel E", for the second agent to conduct the next round of evaluation.
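The strategy adjustment in this example can be sketched as follows. The sketch assumes the feedback arrives in a structured form with an optional suggested price ceiling; the `Feedback` and `RecallStrategy` classes, the `adjust` and `recall` functions, and all prices other than Hotel A's 680 yuan are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RecallStrategy:
    price_min: int
    price_max: int


@dataclass(frozen=True)
class Feedback:
    issue: str
    suggested_price_max: Optional[int] = None  # None = no price suggestion


def adjust(strategy: RecallStrategy, fb: Feedback) -> RecallStrategy:
    """Apply the evaluator's suggestion to the recall strategy."""
    if fb.suggested_price_max is None:
        return strategy
    return replace(strategy, price_max=fb.suggested_price_max)


def recall(hotels: dict[str, int], s: RecallStrategy) -> list[str]:
    """Keep hotels whose nightly price lies within the strategy's range."""
    return [n for n, p in hotels.items() if s.price_min <= p <= s.price_max]


# Only Hotel A's price comes from the example; the rest are made up.
HOTELS = {"Hotel A": 680, "Hotel B": 450, "Hotel C": 320,
          "Hotel D": 280, "Hotel E": 499}

first = RecallStrategy(price_min=200, price_max=800)
fb = Feedback(issue="price too high (Hotel A)", suggested_price_max=500)
second = adjust(first, fb)
print(recall(HOTELS, second))
```

Because Hotel A at 680 yuan lies outside the narrowed 200-500 yuan range, it no longer appears in the second-round candidate set.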

[0048] In the multi-agent collaborative architecture of this embodiment, the second agent not only identifies the problems in the candidate hotel set, but also provides specific adjustment suggestions, enabling the first agent to accurately correct the recall strategy, realize the targeted optimization of the candidate hotel set, and improve the efficiency of multi-agent collaboration and the accuracy of recommendation results.

[0049] As an optional implementation, in step S104, sorting the candidate hotel set through the third intelligent agent based on user profile information and hotel details to generate recommendation results includes: The third intelligent agent sorts the candidate hotel set based on user profile information and hotel details, and selects n candidate hotels as recommended hotels.

[0050] Based on user profile information and hotel details, a corresponding recommendation reason is generated for each recommended hotel, and the recommended hotel and the recommendation reason are used as the recommendation result.

[0051] For example, as shown in Figure 2, the third intelligent agent sorts hotels based on user profile information (family travel, preference for luxury hotels, emphasis on children's facilities) and detailed hotel information, selecting the top 3 hotels as recommended hotels: Hotel A ranked first, Hotel B ranked second, and Hotel C ranked third. Furthermore, the third intelligent agent generates personalized recommendation reasons for each hotel. For example, Hotel A's recommendation reason is "elegant environment, spacious rooms, extremely high hygiene and service standards, suitable for family stays, and allows early entry to the park, making it an excellent choice for easily enjoying Universal Studios"; Hotel B's recommendation reason is "has a 1200-square-meter kids' club, five-star childcare services, and regular clown shows and balloon parties, offering a variety of fun children's activities"; Hotel C's recommendation reason is "has a huge indoor children's playground, equipped with a super-large ball pit and slide combination, allowing children to play to their heart's content, and includes daily pet interaction, which is very popular with children." Finally, the sorted recommended hotels and their corresponding recommendation reasons are presented to the user as recommendation result 202.

[0052] In this embodiment, the processing of the third intelligent agent includes two stages: First, the third intelligent agent comprehensively sorts the candidate hotel set based on detailed hotel information and user profile information, selecting the hotels that best match the current query as the final recommended hotels. Second, the third intelligent agent analyzes the matching relationship between the final recommended hotels and user preferences based on their detailed hotel information, and generates a corresponding recommendation reason for each recommendation result. This recommendation reason explains the specific reasons for recommending the hotel to the user, such as price advantages, location advantages, etc. By introducing a recommendation reason generation mechanism in the sorting stage, based on the sorting of the candidate hotel set, not only is it ensured that the recommendation results deeply match the user's preferences, but it also avoids the problem of unexplainable recommendation logic in traditional recommendation systems, making the recommendation results more transparent and contributing to improved user experience.
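The two stages can be sketched as a sort followed by reason generation. In the disclosure both stages are carried out by the third agent, a language model; the sketch below swaps in a much simpler stand-in (tag overlap for scoring, a text template for reasons) purely to show the data flow. The `Hotel` class, the tag vocabulary, and `rank_with_reasons` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    name: str
    tags: frozenset[str]  # hypothetical attribute tags from hotel details


def rank_with_reasons(
    hotels: list[Hotel], preferences: set[str], n: int
) -> list[tuple[str, str]]:
    # Stage 1: sort by number of matched preferences (stable on ties), keep n.
    ranked = sorted(hotels, key=lambda h: len(h.tags & preferences), reverse=True)
    top = ranked[:n]
    # Stage 2: derive a reason from the same matches that drove the ranking.
    results = []
    for h in top:
        matched = sorted(h.tags & preferences)
        if matched:
            reason = "Matches your preferences: " + ", ".join(matched)
        else:
            reason = "No direct preference match"
        results.append((h.name, reason))
    return results


prefs = {"family", "luxury", "kids_facilities"}
hotels = [
    Hotel("Hotel D", frozenset({"business"})),
    Hotel("Hotel C", frozenset({"kids_facilities"})),
    Hotel("Hotel A", frozenset({"family", "luxury", "kids_facilities"})),
    Hotel("Hotel B", frozenset({"luxury", "kids_facilities"})),
]
result = rank_with_reasons(hotels, prefs, n=3)
for name, reason in result:
    print(name, "-", reason)
```

Generating the reason from the same evidence used for ranking is what keeps the explanation consistent with the order shown to the user.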

[0053] This disclosure also provides a multi-agent collaborative hotel recommendation device 400, as shown in Figure 4, which includes: The intent determination module 401, configured to determine, in response to a user's hotel search request, the user's search intent based on the hotel search request.

[0054] Figure 2 shows an example of the user interface of the hotel recommendation system in this embodiment. Users can enter relevant query information in the search bar 201 to generate a hotel search request. For example, the query information can be information that indicates the user's search intent, such as "hotel", "family hotel", "business hotel" or "Bund hotel". This information is used to identify the type of search the user intends to perform (such as regional search, central search, etc.).

[0055] The recall module 402 is configured to use the first intelligent agent to recall candidate hotels based on the acquired user profile information, environmental information and user search intent.

[0056] In this embodiment, user profile information is used to describe the user's long-term preference characteristics in hotel selection. Environmental information is used to describe the overall distribution of hotels in the user's current area. This environmental information includes, but is not limited to, the distribution of the number of hotels of different brands and star ratings around the user. By introducing the above environmental information, it is equivalent to providing the first intelligent agent (Planner Agent) with an environmental map reflecting the current hotel distribution status, enabling it to not only consider user preferences but also make comprehensive judgments based on the objective environment when recalling candidate hotels, thus possessing a certain degree of environmental awareness. This embodiment uses the first intelligent agent to perform multi-dimensional fusion recall by integrating user profile information, environmental information, and the current user search intent, enabling the recall stage to accurately capture the user's potential preferences and current scenario needs, improve the matching degree between the candidate hotel set and the user's true intent, and provide a high-quality data foundation for the subsequent evaluation and ranking stages.
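The environmental information described above, namely the distribution of nearby hotels by brand and star rating, can be sketched as a simple aggregation. The record fields (`brand`, `star`), the function name `environment_summary`, and the sample data are assumptions for illustration; the disclosure does not specify a data format.

```python
from collections import Counter


def environment_summary(hotels: list[dict]) -> dict:
    """Summarize how many nearby hotels exist per brand and per star rating."""
    return {
        "total": len(hotels),
        "by_brand": dict(Counter(h["brand"] for h in hotels)),
        "by_star": dict(Counter(h["star"] for h in hotels)),
    }


# Hypothetical hotels around the user's current area.
nearby = [
    {"brand": "Brand X", "star": 2},
    {"brand": "Brand X", "star": 2},
    {"brand": "Brand Y", "star": 5},
    {"brand": "Brand Z", "star": 3},
]
env = environment_summary(nearby)
print(env)
```

A summary of this kind could be passed to the first agent alongside the user profile, so that a recall condition is not chosen for which the area has few or no matching hotels.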

[0057] Evaluation module 403 is configured to evaluate the candidate hotel set based on user profile information and environmental information through a second intelligent agent in order to determine whether the candidate hotel set meets the user's needs.

[0058] Specifically, the candidate hotel set output by the first intelligent agent will be sent to the second intelligent agent (Evaluator Agent). In the evaluation module 403, the second intelligent agent will combine user profile information and environmental information to evaluate whether the candidate results are reasonable, and execute different interaction processes based on the evaluation results.

[0059] Interaction module 404 is configured to respond to the fact that the candidate hotel set does not meet the user's needs by sending feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information until the candidate hotel set meets the user's needs.

[0060] If the evaluation results of the second intelligent agent do not meet the user's needs—for example, if the second intelligent agent finds that the candidate hotel set includes hotels with prices exceeding the user's historical affordability—it will send a feedback message to the first intelligent agent stating, "The user is price-sensitive; it is recommended to filter hotels under 300 yuan." This triggers the first intelligent agent to re-recall hotels within that price range until the generated candidate set fully or largely meets the user's budget preferences. Through dynamic interaction and feedback calibration among multiple intelligent agents, the candidate hotel set can be continuously iterated and optimized without manual rule intervention until it accurately matches the user's personalized needs.
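The budget check in this example can be sketched as a threshold test. Interpreting "fully or largely meets" as a minimum share of candidates within budget is an assumption, as are the 0.9 default, the function name `evaluate`, and the empty-set message; in the disclosure the judgment is made by the second agent rather than by a fixed rule.

```python
from typing import Optional


def evaluate(
    prices: list[int], budget_max: int, min_share: float = 0.9
) -> Optional[str]:
    """Return None if enough candidates fit the budget, else a feedback message."""
    if not prices:
        # Hypothetical handling of an empty candidate set.
        return "No candidates were recalled; it is recommended to relax the recall conditions."
    within = sum(p <= budget_max for p in prices)
    if within / len(prices) >= min_share:
        return None  # candidate set fully or largely meets the budget preference
    return (
        "The user is price-sensitive; it is recommended to filter hotels "
        f"under {budget_max} yuan."
    )


print(evaluate([280, 320, 680], budget_max=300))
print(evaluate([280, 290, 300], budget_max=300))
```

Returning either `None` or a message gives the first agent a single signal to act on: accept the set, or re-recall using the suggestion.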

[0061] The sorting module 405 is configured to, in response to the candidate hotel set meeting the user's needs, sort the candidate hotel set according to user profile information and hotel details through the third intelligent agent, and generate recommendation results.

[0062] If the second intelligent agent determines that the candidate hotel set meets the user's needs, the candidate hotel set is passed directly to the third intelligent agent (Ranker Agent) for ranking. For example, the candidate hotel set includes 100 candidate hotels, including hotels such as "Hotel A Guomao Branch (280 yuan / night)" and "Hotel B Guomao Branch (320 yuan / night)" that match the user's budget. The third intelligent agent selects the 10 candidate hotels that best match the current query as the final recommended hotels based on the user profile information (preference for business hotels, proximity to the subway) and hotel details (Hotel A is 200 meters from the subway station, Hotel B is 600 meters from the subway station). Furthermore, based on the degree of match with the user's preferences, "Hotel A Guomao Branch" is ranked first.

[0063] In addition, Figure 3 illustrates an example architecture of the recommendation system disclosed herein, which includes a first agent 301, a second agent 302, a third agent 303, a fourth agent 304, a query parser 305, and a recall tool 306. Each agent in this disclosure can employ any open-source Large Language Model (LLM), with a preference for lightweight models such as the ERNIE series (ERNIE-4.5-0.3B) and the Qwen series (Qwen3.5-0.8B). This disclosure employs multi-agent joint training, enabling all agents to collaboratively optimize around the same global objective in an end-to-end manner; that is, the entire link from user input to the final recommendation result is learned and improved collaboratively. In traditional recommendation system architectures, by contrast, each stage is trained independently and the optimization objectives are fragmented (e.g., recall focuses on recall rate, ranking focuses on click-through rate), so it is difficult to achieve global optimization across the entire link, and the recommendation results are often only locally optimal rather than globally optimal. Joint training allows all modules to share the same optimization signal, ensuring that each step in the decision-making chain optimizes in the same direction, thereby improving the overall recommendation effect.

[0064] This disclosure achieves the following technical effects through the multi-agent collaborative architecture constructed in the aforementioned recommendation system: First, it simplifies the recommendation system architecture and reduces maintenance costs. This disclosure replaces the traditional multi-level independent processing flow of "recall—coarse ranking—fine ranking—re-ranking—manual rules" with multiple clearly defined agents, integrating scattered independent modules into a unified collaborative framework. The system no longer needs to maintain multiple models and the manual rules required for each model separately, significantly reducing the complexity and iteration cost of the recommendation system. Second, it achieves global optimization and improves recommendation consistency. Different agents form a decision-making closed loop through an interactive feedback mechanism. All processing links work collaboratively around the same optimization goal (i.e., the final recommendation result), avoiding optimization deviations caused by conflicting goals at different stages in the traditional architecture, and enabling local optima to be effectively transmitted to global optima. Third, it enhances personalization capabilities and improves user experience. The system can uniformly model user queries, user profile information, and environmental information, and dynamically calibrate recommendation results in multiple rounds of agent interaction. It can automatically generate a final ranking that meets the user's personalized needs without the intervention of manual rules, simultaneously improving the accuracy and interpretability of the recommendation results.

[0065] As an optional implementation, when the intent determination module 401 determines the user's search intent based on the hotel search request, it includes: Hotel search requests are parsed to identify the search type as the user's search intent; the search type includes regional search and / or central search.

[0066] Figure 3 illustrates an example architecture of the recommendation system disclosed herein. When a user initiates a hotel search request, the request contains query information. First, the query is parsed by a query parser 305 to identify the query type (such as regional search, central search, etc.), i.e., to determine what type of search the user intends to perform. If the user searches for "hotels near Guomao," the query parser will identify this as a "regional search" (the user is concerned with geographical location); if the user searches for "hotels on Wangfujing Pedestrian Street," the parser will identify this as a "central search" (the user is concerned with a specific landmark or central point); if the user only searches for "hotels in Beijing," it may be identified as a "city-level search." After the user's search intent is determined, the first agent (Planner Agent) knows the user's core intent and can therefore decide more accurately which tools to invoke and which factors to prioritize, thereby executing a reasonable recall strategy.
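The three query types in this example can be sketched with a small rule-based classifier. The disclosure does not state how the query parser 305 is implemented, so keyword lookup is only a stand-in; the name sets `LANDMARKS`, `AREAS`, and `CITIES` and the function `parse_search_type` are hypothetical.

```python
# Hypothetical gazetteers; a real parser would use a much larger place index.
LANDMARKS = {"wangfujing pedestrian street"}
AREAS = {"guomao"}
CITIES = {"beijing"}


def parse_search_type(query: str) -> str:
    """Classify a hotel query as central, regional, city-level, or unknown."""
    q = query.lower()
    if any(name in q for name in LANDMARKS):
        return "central"   # anchored on a specific landmark or central point
    if any(name in q for name in AREAS):
        return "regional"  # anchored on a geographical area
    if any(name in q for name in CITIES):
        return "city"      # only the city is given
    return "unknown"


for text in ("hotels near Guomao",
             "hotels on Wangfujing Pedestrian Street",
             "hotels in Beijing"):
    print(text, "->", parse_search_type(text))
```

Checking the most specific category first means a query that names both a landmark and its city is still treated as a central search.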

[0067] As an optional implementation, before the recall module 402 recalls the candidate hotel set through the first intelligent agent based on the acquired user profile information, environmental information, and the user's search intent, the device further includes: The profiling module, configured to input the user's historical behavior data into a fourth intelligent agent to extract one or more user preference features as user profile information. This user profile information includes at least one user preference feature selected from geographic location preference, hotel attribute preference, and price preference.

[0068] In this embodiment, the fourth agent utilizes a large model with strong reasoning capabilities to analyze users' historical behavior data, generating high-quality preference inference samples. Based on these samples, a relatively lightweight model is fine-tuned to enable user preference modeling while meeting online latency requirements. For example, users' historical behavior can include various actions such as clicking, browsing, placing orders, and adding items to favorites, reflecting their preferences when choosing hotels. In this embodiment, the fourth agent (Memory Agent) performs in-depth mining and feature extraction on users' historical behavior data, automatically generating multi-dimensional user profile information including geographical location preferences, hotel attribute preferences, and price preferences. This embodiment uses the fourth agent to transform implicit user historical behavior into explicit user preference features, providing stable, reusable, high-quality user profile information for other agents, thereby reducing the system's reliance on manual rules and explicit feature engineering.
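The step from implicit behavior to explicit preference features can be sketched with a weighted aggregation. This is a deliberately simple stand-in for the fourth agent, which in the disclosure is a fine-tuned language model; the action weights, the event fields, and `build_profile` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights: stronger actions count more toward the profile.
WEIGHTS = {"browse": 1, "click": 1, "favorite": 2, "order": 3}


def build_profile(events: list[dict]) -> dict:
    """Derive location, attribute, and price preferences from behavior events."""
    area_score: dict[str, int] = defaultdict(int)
    type_score: dict[str, int] = defaultdict(int)
    price_sum = 0
    weight_sum = 0
    for e in events:
        w = WEIGHTS.get(e["action"], 0)
        area_score[e["area"]] += w
        type_score[e["hotel_type"]] += w
        price_sum += w * e["price"]
        weight_sum += w
    if weight_sum == 0:
        return {}  # no usable history
    return {
        "location_preference": max(area_score, key=area_score.get),
        "attribute_preference": max(type_score, key=type_score.get),
        "price_preference": round(price_sum / weight_sum),  # weighted mean, yuan
    }


history = [
    {"action": "click", "area": "Guomao", "hotel_type": "budget", "price": 300},
    {"action": "order", "area": "Guomao", "hotel_type": "budget", "price": 320},
    {"action": "favorite", "area": "Sanlitun", "hotel_type": "luxury", "price": 900},
]
profile = build_profile(history)
print(profile)
```

The output has the three feature types named above (geographic location, hotel attribute, price), which is the form the other agents would consume.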

[0069] As an optional implementation, the recall module 402 is used to recall candidate hotels by the first intelligent agent based on the acquired user profile information, environmental information, and user search intent, including: The strategy determination unit is configured to determine the corresponding recall strategy through the first intelligent agent based on user profile information, environmental information, and user search intent.

[0070] The recall unit is configured to recall multiple candidate hotels according to a recall strategy.

[0071] The selection unit is configured to select a preset number of candidate hotels from multiple candidate hotels to generate a candidate hotel set.

[0072] For example, the user profile information includes a preference for budget chain hotels, price sensitivity, and a preference for hotels in the 300-500 yuan range; the environmental information indicates that there are 1,000 hotels in the Guomao area, of which 680 are budget hotels and 530 are in the 300-500 yuan range; and the user's search intent is a regional search (near Guomao) with price consideration. In this case, the recall strategy determined by the first intelligent agent is to "prioritize recalling hotels within 3 kilometers of Guomao, priced between 260-560 yuan, and belonging to budget chain brands".
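The three units (strategy determination, recall, selection) can be sketched with the strategy from this example expressed as data. The `Hotel` and `RecallStrategy` classes, the sample hotels, and the nearest-first rule used by `select` are assumptions; the disclosure does not state how the preset number of candidates is chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    name: str
    distance_km: float  # distance from Guomao
    price: int          # yuan per night
    is_budget_chain: bool


@dataclass(frozen=True)
class RecallStrategy:
    max_distance_km: float = 3.0
    price_min: int = 260
    price_max: int = 560
    budget_chain_only: bool = True


def recall(hotels: list[Hotel], s: RecallStrategy) -> list[Hotel]:
    """Recall unit: keep hotels satisfying every condition of the strategy."""
    return [
        h for h in hotels
        if h.distance_km <= s.max_distance_km
        and s.price_min <= h.price <= s.price_max
        and (h.is_budget_chain or not s.budget_chain_only)
    ]


def select(candidates: list[Hotel], preset_count: int) -> list[Hotel]:
    """Selection unit: keep a preset number, nearest first (an assumption)."""
    return sorted(candidates, key=lambda h: h.distance_km)[:preset_count]


hotels = [
    Hotel("H1", 0.5, 300, True),
    Hotel("H2", 2.0, 520, True),
    Hotel("H3", 1.0, 700, True),   # above the price range
    Hotel("H4", 4.5, 350, True),   # beyond 3 km
    Hotel("H5", 1.5, 400, False),  # not a budget chain
    Hotel("H6", 2.8, 260, True),
]
recalled = recall(hotels, RecallStrategy())
candidate_set = select(recalled, preset_count=2)
print([h.name for h in candidate_set])
```

Representing the strategy as a plain record makes it straightforward for the first agent to revise a single field in response to feedback without rebuilding the whole strategy.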

[0073] As an optional implementation, when the strategy determination unit determines the corresponding recall strategy through the first intelligent agent based on user profile information, environmental information, and user search intent, it includes: The recall tools are determined based on user profile information, environmental information, and user search intent; each recall tool includes at least one of the following recall conditions: hotel price, hotel location, hotel star rating, and hotel type.

[0074] The order in which the recall tools are invoked is determined based on their priority.

[0075] Based on the determined recall tools and invocation order, a corresponding recall strategy is generated.

[0076] Specifically, as shown in Figure 3, the recommendation system pre-sets multiple recall tools, each capable of using different recall criteria to recall candidate hotels, such as price, location, star rating, and hotel type. As an example, after comprehensive analysis, the first agent decides to invoke the following recall tools: the price tool, with a recall criterion of 200-600 yuan, which aligns with the user's price-sensitive profile; the location tool, with a recall criterion of within 3 kilometers of the Guomao area, which aligns with the user's search intent; the star rating tool, with a recall criterion of economy / comfort, which aligns with the user's preference for budget chain hotels; and the type tool, with a recall criterion of business hotels, which aligns with the user's preference.

[0077] The first agent decides which recall tools to use to search for hotels, then determines the priority of each tool, such as location being the highest priority, followed by price, star rating, and type. Finally, these are combined to form a complete recall strategy. As an example, the first agent determines the calling order based on the priority of the recall tools: the user's search intent of "near Guomao" clearly points to location, so the location tool has the highest priority; next is the price tool, because the user profile shows price sensitivity; finally, star rating and type are used as supplementary filters. The final calling order of the recall tools is: location tool → price tool → star rating tool → type tool. In this embodiment, the recall strategy is generated based on the determined recall tools and calling order, enabling the first agent to flexibly combine multi-dimensional recall conditions according to user profiles and search intent, improving the accuracy of the recall results and the interpretability of the strategy.

[0078] As an optional implementation, the interaction module 404, which is used to respond to the candidate hotel set not meeting the user's needs by sending feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set according to the feedback information until the candidate hotel set meets the user's needs, further includes: The setting module, configured to set a preset number of interaction rounds between the first agent and the second agent, and to stop the second agent from sending feedback information to the first agent when the interaction between the first agent and the second agent reaches the preset number of interaction rounds.

[0079] As an example, in the interaction between the first and second agents, the system pre-sets a maximum of 3 interaction rounds. The first agent initially recalls a set of candidate hotels, which the second agent evaluates as "too expensive and doesn't meet user preferences," and sends feedback. The first agent adjusts its recall strategy accordingly, generating a second set of candidate hotels, but the second agent still reports that "some hotels are located in remote areas." After the third round of adjustments, if the candidate set still doesn't fully meet the user's needs, the second agent stops sending feedback because the pre-set 3-round interaction limit has been reached, and the system proceeds to the subsequent sorting process. By pre-setting a maximum number of interaction rounds, the system avoids the first and second agents getting stuck in an infinite loop or engaging in ineffective interactions, ensuring that the system optimizes the candidate hotel set within a limited number of rounds, balancing recommendation effectiveness and system response efficiency, and improving overall processing performance.

[0080] As an optional implementation, the interaction module 404, when sending feedback information to the first intelligent agent through the second intelligent agent to trigger the first intelligent agent to regenerate the candidate hotel set based on the feedback information, is configured such that: The second intelligent agent generates feedback information based on user profile information and environmental information, the feedback information comprising the problems in the candidate hotel set that do not meet the user's needs and the corresponding adjustment suggestions, and sends the feedback information to the first intelligent agent.

[0081] The recall strategy of the first intelligent agent is adjusted based on the feedback information.

[0082] The first intelligent agent generates a set of candidate hotels based on the adjusted recall strategy.

[0083] For example, in the first round of recall, the first agent generates a candidate hotel set including "Hotel A" and "Hotel B". The second agent, based on user profile information (price-sensitive, preference for budget hotels) and environmental information (sufficient budget hotels in the Guomao area), evaluates and finds that "Hotel A" is priced at 680 yuan, exceeding the user's budget range. Therefore, the second agent generates feedback: "There is a problem with the price being too high in the candidate set (Hotel A). It is recommended to adjust the price ceiling to 500 yuan and filter budget hotels." Upon receiving the feedback, the first agent adjusts the price condition in the recall strategy from 200-800 yuan to 200-500 yuan, and re-recalls a new candidate hotel set that meets the conditions, including "Hotel B", "Hotel C", "Hotel D", and "Hotel E", for the second agent to conduct the next round of evaluation.

[0084] In the multi-agent collaborative architecture of this embodiment, the second agent not only identifies the problems in the candidate hotel set, but also provides specific adjustment suggestions, enabling the first agent to accurately correct the recall strategy, realize the targeted optimization of the candidate hotel set, and improve the efficiency of multi-agent collaboration and the accuracy of recommendation results.

[0085] As an optional implementation, the sorting module 405 is used to sort the candidate hotel set according to user profile information and hotel details by a third intelligent agent, and when generating recommendation results, it includes: The sorting unit is configured to sort the candidate hotel set based on user profile information and hotel details by a third intelligent agent, and select n candidate hotels as recommended hotels.

[0086] The recommendation reason generation unit is configured to generate a corresponding recommendation reason for each recommended hotel based on user profile information and hotel details, and then use the recommended hotel and recommendation reason as the recommendation result.

[0087] For example, as shown in Figure 2, the third intelligent agent sorts hotels based on user profile information (family travel, preference for luxury hotels, emphasis on children's facilities) and detailed hotel information, selecting the top 3 as recommended hotels: Hotel A ranked first, Hotel B ranked second, and Hotel C ranked third. Furthermore, the third intelligent agent generates personalized recommendation reasons for each hotel. For example, Hotel A's recommendation reason is "elegant environment, spacious rooms, extremely high hygiene and service standards, suitable for family stays, and allows early entry to the park, making it an excellent choice for easily enjoying Universal Studios." Hotel B's recommendation reason is "has a 1200-square-meter kids' club, five-star childcare services, and regular clown shows and balloon parties, offering a variety of fun children's activities." Hotel C's recommendation reason is "has a huge indoor children's playground, equipped with a super-large ball pit and slide combination, allowing children to play to their heart's content, and includes daily pet interaction, which is very popular with children." Finally, the ranked recommended hotels and their corresponding recommendation reasons are presented to the user.

[0088] In this embodiment, the processing of the third intelligent agent includes two stages: In the first stage, the third intelligent agent comprehensively sorts the candidate hotel set based on detailed hotel information and user profile information, selecting the hotels that best match the current query as the final recommended hotels. In the second stage, the third intelligent agent analyzes the matching relationship between the detailed hotel information of the final recommended hotels and user preferences, and generates a corresponding recommendation reason for each recommendation result. The recommendation reason explains the specific reasons for recommending the hotel to the user, such as price advantage, location advantage, etc. By introducing a recommendation reason generation mechanism in the sorting stage, not only is it ensured that the recommendation results deeply match the user's preferences, but the problem of unexplainable recommendation logic in traditional recommendation systems is also avoided, making the recommendation results more transparent and helping to improve the user experience.

[0089] In the technical solution disclosed herein, the acquisition, storage, and application of any type of information, such as user personal information, comply with relevant laws and regulations, necessary confidentiality measures have been taken, and there is no violation of public order and good morals. In the technical solution disclosed herein, user authorization or consent has been obtained before acquiring or collecting user personal information.

[0090] According to embodiments of this disclosure, this disclosure also provides an electronic device, a readable storage medium, and a computer program product.

[0091] Figure 5 shows a schematic block diagram of an example electronic device 500 that can be used to implement embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present disclosure described and / or claimed herein.

[0092] As shown in Figure 5, device 500 includes a computing unit 501, which can perform various appropriate actions and processes based on a computer program stored in read-only memory (ROM) 502 or a computer program loaded from storage unit 508 into random access memory (RAM) 503. RAM 503 may also store various programs and data required for the operation of device 500. The computing unit 501, ROM 502, and RAM 503 are interconnected via bus 504. Input / output (I / O) interface 505 is also connected to bus 504.

[0093] Multiple components in device 500 are connected to I / O interface 505, including: input unit 506, such as keyboard, mouse, etc.; output unit 507, such as various types of monitors, speakers, etc.; storage unit 508, such as disk, optical disk, etc.; and communication unit 509, such as network card, modem, wireless transceiver, etc. Communication unit 509 allows device 500 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0094] The computing unit 501 can be various general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as a multi-agent collaborative hotel recommendation method. For example, in some embodiments, the multi-agent collaborative hotel recommendation method can be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program can be loaded and / or installed on device 500 via ROM 502 and / or communication unit 509. When the computer program is loaded into RAM 503 and executed by the computing unit 501, one or more steps of the multi-agent collaborative hotel recommendation method described above can be performed. Alternatively, in other embodiments, computing unit 501 may be configured to perform a multi-agent collaborative hotel recommendation method by any other suitable means (e.g., by means of firmware).

[0095] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and / or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and / or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.

[0096] The program code used to implement the methods of this disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the program code causes the functions / operations specified in the flowcharts and / or block diagrams to be implemented. The program code may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.

[0097] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

[0098] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0099] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as a data server), or computing systems that include middleware components (e.g., an application server), or computing systems that include frontend components (e.g., a user computer with a graphical user interface or web browser through which a user can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.

[0100] Computer systems can include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. Servers can be cloud servers, servers in distributed systems, or servers incorporating blockchain technology.

[0101] It should be understood that steps may be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in this disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution of this disclosure can be achieved; no limitation is imposed herein.

[0102] The specific embodiments described above do not constitute a limitation on the scope of protection of this disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.

Claims

1. A multi-agent collaborative hotel recommendation method, comprising: in response to a user's hotel search request, determining the user's search intent based on the hotel search request; recalling, by a first intelligent agent, candidate hotels based on acquired user profile information, environmental information, and the user's search intent, to obtain a candidate hotel set; evaluating, by a second intelligent agent, the candidate hotel set based on the user profile information and the environmental information to determine whether the candidate hotel set meets the user's needs; in response to the candidate hotel set not meeting the user's needs, sending, by the second intelligent agent, feedback information to the first intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set according to the feedback information, until the candidate hotel set meets the user's needs; and in response to the candidate hotel set meeting the user's needs, sorting, by a third intelligent agent, the candidate hotel set according to the user profile information and hotel details to generate a recommendation result.
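As a non-limiting illustration, the recall, evaluate, feedback, and rank flow of claim 1 might be sketched as follows. The `Hotel` type, the `recommend` function, the toy agent stand-ins, the catalog data, and the `max_rounds` safeguard are all hypothetical names and values introduced for this sketch; the disclosure does not prescribe any of them, and each agent would in practice be a model-backed component rather than a plain function.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hotel:
    name: str
    price: float
    rating: float


def recommend(
    recall: Callable[[Optional[str]], List[Hotel]],
    evaluate: Callable[[List[Hotel]], Optional[str]],
    rank: Callable[[List[Hotel]], List[Hotel]],
    max_rounds: int = 3,
) -> List[Hotel]:
    """Recall a candidate set, refine it through feedback, then rank it.

    `evaluate` returns None when the set meets the user's needs,
    otherwise a feedback string that is passed to the next recall.
    """
    candidates = recall(None)              # first agent, initial recall
    for _ in range(max_rounds):            # assumed cap on feedback rounds
        feedback = evaluate(candidates)    # second agent
        if feedback is None:
            break
        candidates = recall(feedback)      # first agent regenerates the set
    return rank(candidates)                # third agent


# Toy stand-ins for the three agents (illustrative only).
CATALOG = [Hotel("A", 900, 4.8), Hotel("B", 400, 4.2), Hotel("C", 350, 3.9)]


def toy_recall(feedback: Optional[str]) -> List[Hotel]:
    if feedback == "too expensive":
        return [h for h in CATALOG if h.price <= 500]
    return [h for h in CATALOG if h.rating >= 4.5]


def toy_evaluate(candidates: List[Hotel]) -> Optional[str]:
    budget = 500  # assumed to come from the user profile
    if not candidates or any(h.price > budget for h in candidates):
        return "too expensive"
    return None


def toy_rank(candidates: List[Hotel]) -> List[Hotel]:
    return sorted(candidates, key=lambda h: h.rating, reverse=True)


result = recommend(toy_recall, toy_evaluate, toy_rank)
print([h.name for h in result])
```

In this sketch the first recall returns only the over-budget hotel A, the evaluator rejects it, and the second recall yields a set that passes evaluation and is then ranked by rating.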

2. The hotel recommendation method of claim 1, wherein determining the user's search intent based on the hotel search request comprises: parsing the hotel search request to identify the search type of the hotel search request as the user's search intent, wherein the search type includes regional search and / or central search.

3. The hotel recommendation method of claim 1, wherein, before the first intelligent agent recalls candidate hotels based on the acquired user profile information, environmental information, and the user's search intent, the method further comprises: inputting the user's historical behavior data into a fourth intelligent agent to extract one or more user preference features as the user profile information, wherein the user profile information includes at least one of the following user preference features: geographic location preference, hotel attribute preference, and price preference.

4. The hotel recommendation method of claim 1, wherein recalling, by the first intelligent agent, candidate hotels based on the acquired user profile information, environmental information, and the user's search intent comprises: determining, by the first intelligent agent, a corresponding recall strategy based on the user profile information, the environmental information, and the user's search intent; recalling multiple candidate hotels according to the recall strategy; and selecting a preset number of candidate hotels from the multiple candidate hotels to generate the candidate hotel set.

5. The hotel recommendation method of claim 4, wherein determining, by the first intelligent agent, the corresponding recall strategy based on the user profile information, the environmental information, and the user's search intent comprises: determining recall tools based on the user profile information, the environmental information, and the user's search intent, wherein each of the recall tools includes at least one of the following recall conditions: hotel price, hotel location, hotel star rating, and hotel type; determining the order in which the recall tools are invoked based on their priority; and generating the corresponding recall strategy based on the determined recall tools and the invocation order.
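By way of a hedged example, the recall strategy of claims 4 and 5 (choose recall tools, order them by priority, recall, then keep a preset number of candidates) might look like the sketch below. `RecallTool`, `select_tools`, `build_strategy`, `run_strategy`, the profile keys, and the convention that a lower priority value is invoked first are all assumptions made for illustration. Merging tool results as an ordered union is likewise only one possible reading of how the tools combine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RecallTool:
    name: str
    priority: int                      # lower value is invoked earlier (assumed)
    condition: Callable[[Dict], bool]  # recall condition on one hotel record


def select_tools(profile: Dict) -> List[RecallTool]:
    """Pick recall tools from hypothetical profile keys."""
    tools = []
    if "min_star" in profile:
        tools.append(RecallTool(
            "by_star", 2, lambda h: h["star"] >= profile["min_star"]))
    if "max_price" in profile:
        tools.append(RecallTool(
            "by_price", 1, lambda h: h["price"] <= profile["max_price"]))
    return tools


def build_strategy(tools: List[RecallTool]) -> List[RecallTool]:
    """A strategy here is simply the tools in invocation order."""
    return sorted(tools, key=lambda t: t.priority)


def run_strategy(strategy: List[RecallTool], hotels: List[Dict],
                 limit: int) -> List[Dict]:
    """Invoke tools in order, merge results without duplicates,
    and keep a preset number of candidates."""
    selected, seen = [], set()
    for tool in strategy:
        for hotel in hotels:
            if hotel["name"] not in seen and tool.condition(hotel):
                seen.add(hotel["name"])
                selected.append(hotel)
    return selected[:limit]


hotels = [
    {"name": "A", "price": 300, "star": 3, "type": "business"},
    {"name": "B", "price": 800, "star": 5, "type": "resort"},
    {"name": "C", "price": 450, "star": 4, "type": "business"},
]
profile = {"max_price": 500, "min_star": 4}

strategy = build_strategy(select_tools(profile))
candidate_set = run_strategy(strategy, hotels, limit=2)
print([t.name for t in strategy], [h["name"] for h in candidate_set])
```

Because the price tool runs first, hotels A and C fill the two available slots before the star-rating tool contributes hotel B.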

6. The hotel recommendation method of claim 1, wherein, in response to the candidate hotel set not meeting the user's needs, sending feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information, until the candidate hotel set meets the user's needs, further comprises: setting a preset number of interaction rounds between the first intelligent agent and the second intelligent agent; and when the interaction between the first intelligent agent and the second intelligent agent reaches the preset number of interaction rounds, stopping the sending of the feedback information from the second intelligent agent to the first intelligent agent.

7. The hotel recommendation method of claim 1, wherein sending feedback information from the second intelligent agent to the first intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set based on the feedback information, comprises: generating, by the second intelligent agent and based on the user profile information and the environmental information, the problems for which the candidate hotel set does not meet the user's needs, together with adjustment suggestions, as the feedback information, and sending the feedback information to the first intelligent agent; adjusting the recall strategy of the first intelligent agent based on the feedback information; and generating, by the first intelligent agent, the candidate hotel set according to the adjusted recall strategy.
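One possible, non-limiting shape for the feedback information of claim 7 is a record that carries both the identified problem and an adjustment suggestion that the first agent applies to its recall strategy. The `RecallStrategy` and `Feedback` types, their fields, and the budget check are hypothetical simplifications introduced for this sketch.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RecallStrategy:
    max_price: float
    min_star: int


@dataclass(frozen=True)
class Feedback:
    problem: str      # why the candidate set does not meet the user's needs
    suggestion: Dict  # strategy field name -> suggested new value


def evaluate(candidates: List[Dict], profile: Dict) -> Optional[Feedback]:
    """Second agent stand-in: flag candidates priced above the budget."""
    over = [h for h in candidates if h["price"] > profile["budget"]]
    if over:
        return Feedback(
            problem=f"{len(over)} hotel(s) exceed the budget",
            suggestion={"max_price": profile["budget"]},
        )
    return None


def adjust(strategy: RecallStrategy, feedback: Feedback) -> RecallStrategy:
    """First agent stand-in: apply the suggested changes to the strategy."""
    return replace(strategy, **feedback.suggestion)


strategy = RecallStrategy(max_price=1000, min_star=3)
candidates = [
    {"name": "A", "price": 900, "star": 5},
    {"name": "B", "price": 400, "star": 4},
]
profile = {"budget": 500}

fb = evaluate(candidates, profile)
adjusted = adjust(strategy, fb) if fb else strategy
print(fb, adjusted)
```

Keeping the problem description separate from the machine-applicable suggestion lets the first agent adjust its strategy mechanically while the problem text remains available for logging or for a language-model prompt.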

8. The hotel recommendation method according to any one of claims 1-7, wherein sorting, by the third intelligent agent, the candidate hotel set based on the user profile information and hotel details to generate the recommendation result comprises: sorting, by the third intelligent agent, the candidate hotel set according to the user profile information and the hotel details, and selecting n candidate hotels as recommended hotels; and generating, based on the user profile information and the hotel details, a corresponding recommendation reason for each recommended hotel, and using the recommended hotels and the recommendation reasons as the recommendation result.
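The ranking step of claim 8 (sort, keep n hotels, attach a recommendation reason to each) could be illustrated as below. The scoring formula, the `preferred_price` profile key, the rating threshold, and the reason templates are assumptions chosen only to make the sketch concrete; the disclosure leaves the ranking criteria to the third agent.

```python
from typing import Dict, List, Tuple


def score(hotel: Dict, profile: Dict) -> float:
    """Hypothetical score: rating minus a penalty for exceeding
    the user's preferred price."""
    price_gap = max(0.0, hotel["price"] - profile["preferred_price"])
    return hotel["rating"] - price_gap / 100.0


def reason(hotel: Dict, profile: Dict) -> str:
    """Build a short recommendation reason from profile and hotel details."""
    parts = []
    if hotel["price"] <= profile["preferred_price"]:
        parts.append("within your usual price range")
    if hotel["rating"] >= 4.5:
        parts.append("highly rated")
    return ", ".join(parts) or "close match to your search"


def rank_with_reasons(candidates: List[Dict], profile: Dict,
                      n: int) -> List[Tuple[str, str]]:
    """Sort candidates by score, keep the top n, and pair each with a reason."""
    ranked = sorted(candidates, key=lambda h: score(h, profile), reverse=True)
    return [(h["name"], reason(h, profile)) for h in ranked[:n]]


candidates = [
    {"name": "A", "price": 900, "rating": 4.8},
    {"name": "B", "price": 400, "rating": 4.2},
    {"name": "C", "price": 450, "rating": 4.6},
]
profile = {"preferred_price": 500}

recommendation = rank_with_reasons(candidates, profile, n=2)
for name, why in recommendation:
    print(name, "-", why)
```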

9. A multi-agent collaborative hotel recommendation device, comprising: an intent determination module configured to, in response to a user's hotel search request, determine the user's search intent based on the hotel search request; a recall module configured to recall, through a first intelligent agent, candidate hotels based on acquired user profile information, environmental information, and the user's search intent, to obtain a candidate hotel set; an evaluation module configured to evaluate, through a second intelligent agent, the candidate hotel set based on the user profile information and the environmental information, so as to determine whether the candidate hotel set meets the user's needs; an interaction module configured to, in response to the candidate hotel set not meeting the user's needs, send feedback information to the first intelligent agent through the second intelligent agent, triggering the first intelligent agent to regenerate the candidate hotel set according to the feedback information, until the candidate hotel set meets the user's needs; and a sorting module configured to, in response to the candidate hotel set meeting the user's needs, sort the candidate hotel set according to the user profile information and hotel details through a third intelligent agent to generate a recommendation result.

10. The hotel recommendation device according to claim 9, wherein the intent determination module, when determining the user's search intent based on the hotel search request, is configured to: parse the hotel search request to identify the search type of the hotel search request as the user's search intent, wherein the search type includes regional search and / or central search.

11. The hotel recommendation device according to claim 9, further comprising: a profiling module configured to, before the recall module recalls candidate hotels through the first intelligent agent based on the acquired user profile information, environmental information, and the user's search intent to obtain the candidate hotel set, input the user's historical behavior data into a fourth intelligent agent to extract one or more user preference features as the user profile information, wherein the user profile information includes at least one of the following user preference features: geographic location preference, hotel attribute preference, and price preference.

12. The hotel recommendation device according to claim 9, wherein the recall module, for recalling candidate hotels through the first intelligent agent based on the acquired user profile information, environmental information, and the user's search intent to obtain the candidate hotel set, comprises: a strategy determination unit configured to determine a corresponding recall strategy through the first intelligent agent based on the user profile information, the environmental information, and the user's search intent; a recall unit configured to recall multiple candidate hotels according to the recall strategy; and a selection unit configured to select a preset number of candidate hotels from the multiple candidate hotels to generate the candidate hotel set.

13. The hotel recommendation device of claim 12, wherein the strategy determination unit, when determining the corresponding recall strategy through the first intelligent agent based on the user profile information, the environmental information, and the user's search intent, is configured to: determine recall tools based on the user profile information, the environmental information, and the user's search intent, wherein each of the recall tools includes at least one of the following recall conditions: hotel price, hotel location, hotel star rating, and hotel type; determine the order in which the recall tools are invoked based on their priority; and generate the corresponding recall strategy based on the determined recall tools and the invocation order.

14. The hotel recommendation device of claim 9, further comprising: a setting module configured to set in advance a preset number of interaction rounds between the first intelligent agent and the second intelligent agent, and to cause the second intelligent agent to stop sending the feedback information to the first intelligent agent when the interaction between the first intelligent agent and the second intelligent agent reaches the preset number of interaction rounds.

15. The hotel recommendation device of claim 9, wherein the interaction module, when sending feedback information to the first intelligent agent through the second intelligent agent to trigger the first intelligent agent to regenerate the candidate hotel set based on the feedback information, is configured to: generate, through the second intelligent agent and based on the user profile information and the environmental information, the problems for which the candidate hotel set does not meet the user's needs, together with adjustment suggestions, as the feedback information, and send the feedback information to the first intelligent agent; adjust the recall strategy of the first intelligent agent based on the feedback information; and generate, through the first intelligent agent, the candidate hotel set according to the adjusted recall strategy.

16. The hotel recommendation device according to any one of claims 9-15, wherein the sorting module, for sorting the candidate hotel set through the third intelligent agent according to the user profile information and hotel details to generate the recommendation result, comprises: a sorting unit configured to sort the candidate hotel set through the third intelligent agent according to the user profile information and the hotel details, and to select n candidate hotels from the sorted set as recommended hotels; and a recommendation reason generation unit configured to generate a corresponding recommendation reason for each recommended hotel based on the user profile information and the hotel details, and to use the recommended hotels and the recommendation reasons as the recommendation result.

17. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.

18. A non-transitory computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-8.

19. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1-8.