Smart agent for generating personalized retrieval search queries for a user
By combining LLMs to analyze search queries and user history, the system generates personalized retrieval queries, addressing the limitations of conventional systems by reducing hallucinations and resource usage.
Patent Information
- Authority / Receiving Office: WO · WO
- Patent Type: Applications
- Current Assignee / Owner: GOOGLE LLC
- Filing Date: 2024-12-20
- Publication Date: 2026-06-25
AI Technical Summary
Conventional systems fail to provide personalized retrieval search queries for users unfamiliar with a particular field of inquiry, placing the burden on users to navigate and determine relevant queries, and are prone to hallucinations and resource-intensive training requirements.
Utilize a combination of large language models (LLMs) to analyze search queries and user history, generating personalized retrieval queries through a specialized search model, reducing hallucinations and resource usage by bypassing continuous training.
Generates accurate and personalized search queries without continuous training, improving system performance by reducing hallucinations and resource consumption.
Description
PATENT APPLICATION
Attorney Docket No.: 31730 / 308583-00
SMART AGENT FOR GENERATING PERSONALIZED RETRIEVAL SEARCH QUERIES FOR A USER
FIELD OF TECHNOLOGY
[0001] The present disclosure relates to generating personalized content recommendations to a user and, more specifically, to techniques for generating personalized retrieval search queries for a user based on a user query and / or search history.
BACKGROUND
[0002] The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventor(s), to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
[0001] When serving content to a user, conventional systems serve individual items and / or provide recommendations directly associated with a particular search query. However, when users are unfamiliar with a particular field of inquiry, conventional systems place the burden of determining relevant queries and how to navigate the field on the user.
SUMMARY
[0002] In some aspects, the techniques described herein relate to a computer-implemented method for generating retrieval queries for a user, the computer-implemented method including: receiving, by one or more processors, a search query from a user associated with an account profile, the account profile including a user history; generating, by the one or more processors and based on the search query, a context attribute summary indicative of a context for the search query; generating, by the one or more processors and based on the user history, a summarized user browsing history indicative of one or more past searches performed by the user; generating, by the one or more processors and using the context attribute summary and the summarized user browsing history as inputs to a trained language model, a plurality of retrieval queries; and transmitting, by the one or more processors, the plurality of retrieval queries to a search model to generate one or more recommendations associated with the search query for the user.
[0003] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the generating of the context attribute summary includes: receiving, by the one or more processors, a plurality of search results associated with the search query; scraping, by the one or more processors, one or more descriptions associated with the plurality of search results; and analyzing, by the one or more processors, the one or more descriptions to generate the context attribute summary.
[0004] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the scraping is performed by a trained large language model.
[0005] In some aspects, the techniques described herein relate to a computer-implemented method 2 or 3, wherein the scraping includes: retrieving, by the one or more processors, unstructured data associated with a description of at least one of the plurality of search results; and analyzing, by the one or more processors, the unstructured data to detect one or more salient words to generate the context attribute summary.
[0006] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the scraping includes: retrieving, by the one or more processors, image data associated with at least one of the plurality of search results; analyzing, by the one or more processors and using an image analysis model, the image data to generate one or more visual details associated with the image data; and generating, by the one or more processors, the context attribute summary based on the one or more visual details.
[0007] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the scraping includes: retrieving, by the one or more processors, review data associated with at least one of the plurality of search results; and analyzing, by the one or more processors, the review data to generate the context attribute summary.
[0008] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the trained language model is a multi-task transformer model.
[0009] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the context attribute summary includes at least one or more attributes associated with the search query.
[0010] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the context attribute summary additionally includes at least one or more brands associated with the search query.
[0011] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the plurality of retrieval queries includes at least a combination of the at least one or more attributes and the at least one or more brands.
[0012] In some aspects, the techniques described herein relate to a computer-implemented method, wherein generating the one or more recommendations includes: receiving, by the one or more processors and from the search model, search results associated with the plurality of retrieval queries; and generating, by the one or more processors and using a trained large language model, the one or more recommendations based on the received search results.
[0013] In some aspects, the techniques described herein relate to a computer-implemented method, wherein generating the one or more recommendations includes: generating, by the one or more processors, a summary for each of the one or more recommendations.
[0014] In some aspects, the techniques described herein relate to a computer-implemented method, wherein the summary includes one or more detailed attributes associated with a corresponding recommendation of the one or more recommendations.
[0015] In some aspects, the techniques described herein relate to a computer-implemented method 12 or 13, wherein the summary includes an explanation for a corresponding recommendation of the one or more recommendations, the explanation including an indication of which attributes of the corresponding recommendation are predicted to be desirable to the user.
[0016] In some aspects, the techniques described herein relate to a computer-implemented method, further including: generating, by the one or more processors, one or more prompts for additional information to display to the user; and updating, by the one or more processors, the context attribute summary based on the additional information.
[0017] In some aspects, the techniques described herein relate to an apparatus, functioning as a server device, including: a transceiver; and one or more processors configured to perform a method according to any one of the preceding claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] FIG. 1 is a block diagram of an example system in which techniques of the present disclosure can be implemented.
[0004] FIG. 2A depicts an example artificial intelligence model that may be implemented in the system of FIG. 1.
[0005] FIG. 2B depicts an example large language model that may be implemented in the system of FIG. 1.
[0006] FIG. 3 depicts a block diagram of an example system for analyzing data associated with a search query and, based on such, generating retrieval queries and associated recommendations personalized for a user, implemented in the system of FIG. 1.
[0007] FIG. 4 depicts a diagram of an example user interface for providing recommendations associated with an input search query to a user, implemented in the system of FIG. 1.
[0008] FIG. 5 is a flow diagram of an example method for determining and generating retrieval queries for a user, implemented in the system of FIG. 1.
DETAILED DESCRIPTION OF THE DRAWINGS
[0018] Because conventional systems are designed to provide a singular answer rather than generating queries to predict next steps for a user searching within a field, conventional systems are unable to properly predict and / or provide recommendations for logical next steps to guide a user through an overarching search session.
[0019] One potential solution is to address such a problem by using large language models (LLMs) trained on user data to generate large sets of recommendations as an attempt to broadly provide a number of potentially related details. However, such systems would fail to consider nuances of language or visual descriptions regarding particular queries and / or content items. Moreover, such systems would require prohibitive and continuous training times to keep the model up to date on both user preferences and potential results. Still further, such LLMs would be vulnerable to hallucinations regarding the output recommendations and / or results, and may therefore provide results that are incorrect.
[0020] By using multiple models to perform different elements of the search, analysis, and recommendation generation process, the instant techniques may eliminate or otherwise mitigate the problems detailed above. Notably, by using a first set of one or more models that analyze a search query and user history to generate predictions and summaries for the user before generating personalized retrieval queries based on such, and a second set of one or more models that perform the searches using the retrieval queries, the instant techniques reduce the overall hallucination and / or error rate of the models and improve overall resource usage by reducing and / or removing training requirements.
[0021] More particularly, by using large language model(s) (LLM(s)) to analyze the descriptive elements of a content item (e.g., descriptions, images, reviews, etc.) and elements of a user account history and / or search query, a system implementing the instant techniques may generate personalized search queries. The system may then utilize a specialized search model by feeding the personalized search queries as an input to determine recommendations to generate for the user. Utilizing the personalized search queries with the specialized search model further reduces risk of hallucinations, as the search model will be less likely to find any recommendations that substantially deviate from those search queries. This in effect enables a system to exclude such queries from the ultimate recommendation generation.
[0022] In some implementations, the search query may be a broad and / or vague search query. In such implementations, because the search query is broad and / or vague, a conventional model may generate hallucinations and / or unrelated or inaccurate results. For example, an initial search query of “ski jacket” as analyzed by a conventional model may generate incorrect results (e.g., a child ski jacket when the user is an adult) and / or hallucinations for products that do not exist or are not accurate (e.g., French press exercises when the user is interested in French press coffee). As such, the instant techniques may generate additional data based on the search query and / or other factors to improve the overall input into a model and generate more accurate and existent options.
[0023] Moreover, by enabling a system to analyze a user profile and search queries directly without specifically training or tuning the models to a particular user, the instant techniques reduce the overall computer resource usage and time spent training models, especially in cases where there is a large number of such users. Notably, by analyzing the user history and associated content items or search queries directly using the LLM, the instant techniques allow for determination and prediction of a user preferred action without requiring particular training. As such, and without the need to train the model on constantly up-to-date content items, the instant techniques bypass the traditional requirements for training while outputting similarly accurate results, improving the overall performance of a system implementing the instant techniques.
[0024] Notably, a logical solution to such a problem may be to use a single model to analyze such queries, the model trained using (i) particular user data to improve the ability to analyze a particular user, (ii) results data to ensure that the model provides updated and accurate data (e.g., that new results are not being missed and / or no-longer relevant results are not being provided), (iii) semantic data to attempt to mitigate potential hallucinations, and / or (iv) any other such data or combination thereof. Such training would be resource intensive and difficult to perform in real time and / or with the frequency needed to stay updated. However, by utilizing simple LLMs as described herein, the need for such training may be obviated, and the overall system improved while maintaining the benefits of such. As such, the systems and methods described herein offer an improvement over traditional techniques.
[0025] FIG. 1 illustrates an example system 100 in which the techniques disclosed herein may be implemented. The example system 100 includes a client device 102, a computing system 104, a search engine 106, a content database 108, and a network 110. The computing system 104 in some implementations is remote from the client device 102 and / or search engine 106, and communicatively coupled to the client device 102 and / or search engine 106 via the network 110. It will be understood that system 100 is exemplary, and that other systems may include additional, fewer, or alternative components (e.g., training module 156 may be omitted, data extraction module 150 and personalization module 152 may be combined, etc.). Similarly, arrangements of the components of system 100 may be modified. For example, some elements of system 100 may be combined, split apart, swapped, etc.
[0026] The network 110 may be a single communication network (e.g., the Internet), and in some implementations also includes one or more additional networks. As an example, the network 110 may include a cellular network, the Internet, and a server-side local area network (LAN). While FIG. 1 shows only a single client device 102, computing system 104, and search engine 106, it will be understood that the system 100 may include any suitable number of similar client devices, computing devices, and / or databases operating according to the principles disclosed herein.
[0027] Generally, the client device 102 is configured to access information resources (e.g., web pages, application user interfaces, etc.) that may be supplied or published by the computing system 104, content providers (e.g., storing content in content database 108), and / or other entities, and the computing system 104 is generally configured to analyze and select content to be served to the client device 102 along with links to information resources (e.g., landing pages) associated with the sponsored content options. As used herein, “sponsored content” may refer to advertisement / marketing content, or any other type of content that is provided by and / or otherwise associated with a third party (e.g., a party other than an entity associated with the computing system 104 of FIG. 1). The information resources, and / or content items (e.g., digital advertisements) associated with the information resources, may be stored in content databases such as content database 108, and may be accessed by the client device 102 and / or computing system 104 through a search engine 106 (e.g., a device including a search engine model and / or module). In other implementations, the search engine 106 and / or content database 108 is instead a part of the computing system 104.
[0028] The client device 102 may be or include any stationary, mobile, or portable computing device with wired and / or wireless communication capability (e.g., a smartphone, a tablet computer, a laptop computer, a desktop computer, a smart wearable device such as smart glasses or a smart watch, a vehicle head unit computer, etc.). In the example implementation of FIG. 1, the client device 102 includes a network interface 120, a processor 122, memory 124, and a display 126. The processor 122 may be a single processor (e.g., a central processing unit (CPU)), or may include a set of processors (e.g., multiple CPUs, or one or more CPUs and one or more graphics processing units (GPUs)).
[0029] The memory 124 includes one or more computer-readable, non-transitory storage units or devices, which may include persistent (e.g., hard disk) and / or non-persistent memory components. The memory 124 stores instructions that are executable by the processor 122 to perform various operations, including the instructions of various software applications and the data generated and / or used by such applications. In the example implementation of FIG. 1, the memory 124 stores at least an application 130, which may be, for example, a web browser application, a mobile application downloaded from an application store, or a video player application.
[0030] Generally, application 130 is executed by processor 122 to present information resources, text data, image data, audio data, etc. to the user of the client device 102 via the display 126 (and / or one or more speakers of the client device 102, not shown in FIG. 1). In an implementation where the application 130 is a web browser application, for instance, an information resource may be a web page hosted by a publisher or a content provider, with the web browser causing the client device 102 to download HyperText Markup Language (HTML), scripts, and / or other code of the web page for presentation to a user via the display 126.
[0031] The display 126 includes hardware, firmware, and / or software configured to enable a user to view visual outputs of the client device 102, and may use any suitable display technology (e.g., LED, OLED, LCD, etc.). In some implementations, the display 126 is incorporated in a touchscreen having both display and manual input capabilities. Moreover, in some implementations where the client device 102 is a wearable device, the display 126 is a transparent viewing component (e.g., lenses of smart glasses) with integrated electronic components. For example, the display 126 may include micro-LED or OLED electronics embedded in lenses of smart glasses.
[0032] The network interface 120 includes hardware, firmware, and / or software configured to enable the client device 102 to exchange electronic data with the computing system 104 via the network 110. For example, the network interface 120 may include a cellular communication transceiver, a Wi-Fi transceiver, and / or transceivers for one or more other wired and / or wireless communication technologies.
[0033] While FIG. 1 shows client device 102 as a single component communicating directly (i.e., via network 110) with the computing system 104, in some implementations the subcomponents of client device 102 shown in FIG. 1 are instead divided among two or more user-side devices. As just one example, a pair of smart glasses may include the processor 122, the memory 124, and the display 126, while a smartphone may include another processing unit, another memory, another display, and the network interface 120. The smart glasses (or smart helmet, etc.) may then communicate as needed with the smartphone (e.g., via Bluetooth) to enable the operations described herein.
[0034] The computing system 104 includes a network interface 140, a processor 142, and memory 144. The network interface 140 includes hardware, firmware, and / or software configured to enable the computing system 104 to exchange electronic data with the client device 102 and other, similar client devices via the network 110. For example, the network interface 140 may include a wired or wireless router and a modem. The processor 142 may be a single processor, may include two or more processors, etc. The computing system 104 may include one or more servers, for example, which may reside at a single location or multiple locations.
[0035] The memory 144 is a computer-readable, non-transitory storage unit or device, or collection of units / devices that may include persistent and / or non-persistent memory components. The memory 144 stores the instructions of a data extraction module 150, a personalization module 152, a recommendation module 154, and a training module 156, each of which may be executed by the processor 142. In the example system 100, the data extraction module 150 includes (or remotely accesses) a scraping module 160 and / or a visual scraping module 162. The personalization module 152 includes (or remotely accesses) a summary generation module 164 and / or a retrieval query generation module 166. The recommendation module 154 includes (or remotely accesses) a guidance module 168 and / or an attribute detail module 170. The training module 156 uses historical data 172 and / or generated data 174 to train one or more machine learning models (e.g., the scraping module 160, summary generation module 164, retrieval query generation module 166, etc.). In some implementations, some of the software modules / units shown in FIG. 1 are omitted. For example, the recommendation module 154 may omit the attribute detail module 170, or the computing system 104 may omit training module 156 (e.g., if the training is done by a different computing system). The data extraction module 150, personalization module 152, and / or recommendation module 154 may be or include an LLM or another suitable generative AI model. As another example, the data extraction module 150 and personalization module 152 include an LLM while the recommendation module 154 includes a non-LLM model.
[0036] The data extraction module 150, personalization module 152, recommendation module 154, and / or training module 156 may be software modules comprising instructions executed by the processor 142 to perform the various operations described herein. It is understood, however, that other architectures are also possible (e.g., with functionality of modules 150, 152, and / or 154 being provided by a single software module, or with functionality of data extraction module 150 being split among a plurality of software modules, and so on).
[0037] Generally, the data extraction module 150 uses a scraping module 160 and / or a visual scraping module 162 to analyze content items stored in the content database 108 and / or accessed via a search engine 106. In some implementations, a user provides a search query to the client device 102, which transmits the search query to the computing system 104 via the network 110. The data extraction module 150 may then access one or more information resources associated with content (e.g., products, documents, services, etc.). Using the scraping module 160, the data extraction module 150 may scrape text associated with the information resources. For example, the scraping module 160 may analyze a description of a product, one or more user reviews associated with a product, a title of a webpage, etc. and generate a list of attributes related to the search query. In some implementations, the scraping module 160 is, includes, or functions similarly to an LLM as described below with regard to FIGs. 2A and 2B. In particular, the scraping module 160 may be trained to determine a list of relevant attributes associated with the search query by detecting one or more salient words in the description, one or more words associated with descriptive attributes, one or more words associated with related brands, etc. Depending on the implementation, the scraping module 160 may determine salient words and / or other such attributes using vector embedding and / or comparison techniques as discussed herein.
[0038] In some implementations, the scraping module 160 may analyze structured data sets, unstructured data sets, and / or combinations thereof. In further implementations, the scraping module 160 may analyze the description to detect one or more nuanced categories that may not be predefined. For example, a description of a ski jacket may include that the ski jacket is ‘insulated’, which may not be well-defined and / or different across multiple brands and / or retailers. Similarly, a style or material may be difficult to define through semantic and / or LLM-based comparisons alone unless structured (e.g., through explicit definition by a publisher or via the information resource), and conventional systems may not be able to determine differences or similarities of such. As such, the scraping module 160 may analyze the description and treat structured and unstructured data separately. For example, the scraping module 160 may generate attributes based on structured data directly, but may instead pull unstructured data as associated with related category words (e.g., style, material, etc.) and generate attributes using the category words in conjunction with the nuanced unstructured data instead. In further implementations, the computing system 104 may generate a prompt for a user to confirm whether an unstructured data term is of relevance to the search query.
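The separate handling of structured and unstructured data described above can be illustrated with a minimal sketch. It assumes a small, hypothetical category vocabulary (`CATEGORY_WORDS`) and a hypothetical helper `extract_attributes`; neither name appears in the disclosure, and a simple word lookup stands in for the LLM-based analysis the disclosure contemplates.

```python
# Hypothetical category vocabulary. The disclosure names only "style" and
# "material" as example category words; the terms below are invented.
CATEGORY_WORDS = {
    "style": {"slim", "relaxed", "retro"},
    "material": {"insulated", "down", "fleece"},
}


def extract_attributes(structured: dict[str, str], description: str) -> list[str]:
    """Build an attribute list from structured fields and free-text description."""
    # Structured data is used directly as attributes.
    attributes = [f"{key}: {value}" for key, value in structured.items()]
    # Unstructured terms are paired with a related category word instead.
    for token in description.lower().replace(",", " ").split():
        for category, words in CATEGORY_WORDS.items():
            if token in words:
                attributes.append(f"{category}: {token}")
    return attributes


attrs = extract_attributes({"size": "M"}, "Insulated, relaxed fit ski jacket")
```

Here the nuanced term 'insulated' is not treated as a standalone attribute but is attached to the 'material' category, mirroring the pairing of category words with unstructured data.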
[0039] In some implementations, the data extraction module 150 is or includes multimodal models to analyze multimodal data. In some such implementations, the data extraction module 150 includes a visual scraping module 162, an audio scraping module (not shown), and / or any other such multimodal scraping modules (not shown). The visual scraping module 162 may analyze one or more images associated with the content related to the search query. For example, the visual scraping module 162 may include one or more image analysis modules (e.g., using optical character recognition (OCR), image segmentation techniques, convolutional neural networks (CNNs) and / or other neural networks configured to analyze images, etc.) to analyze images associated with the content item(s) and generate descriptive attributes based on such (e.g., size, color, shape, purpose, etc.).
[0040] Depending on the implementation, the attributes generated by the scraping module 160 and / or visual scraping module 162 may be or include descriptive attributes (e.g., size, color, shape, purpose, etc.), brand names, author names, etc. In further implementations, the attributes may be or include predictive attributes as well (e.g., attributes that the user is likely to seek guidance regarding, perform searches regarding, etc.). The data extraction module 150 may then provide the generated attributes to the personalization module 152 for use in generating retrieval queries.
[0041] In some implementations, the personalization module 152 uses a summary generation module 164 to generate a user historical data summary. The personalization module 152 may then use a retrieval query generation module 166 to generate a list of personalized retrieval queries based on the attributes from the data extraction module 150 and the user historical data summary from the summary generation module 164.
[0042] In some such implementations, the summary generation module 164 analyzes a user history associated with a user. Depending on the implementation, the user may have a user account associated with the computing system 104 via which the user logs into an application 130. In further implementations, the user may have a browsing history stored via saved data, browser data, cookies, etc. that the computing system 104 may access to analyze the user history. The summary generation module 164 may summarize past searches by the user in generating a summary. In some implementations, the summary generation module 164 may only include relevant searches in the summary. For example, the summary generation module 164 may only include data from past searches in the summary if the past searches are determined to meet a relevance threshold with the current search query (e.g., using LLM analysis techniques as described below with regard to FIGs. 2A and 2B).
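As an illustration of the relevance threshold described above, the following sketch keeps only past searches that score at or above a threshold against the current query. The disclosure contemplates LLM analysis for this step; the token-overlap (Jaccard) score used here is a deliberately simple stand-in, and the function names and the threshold value of 0.2 are assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for LLM-based relevance."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def relevant_history(query: str, past_searches: list[str],
                     threshold: float = 0.2) -> list[str]:
    """Return the past searches that meet the (assumed) relevance threshold."""
    return [s for s in past_searches if jaccard(query, s) >= threshold]


history = ["waterproof ski gloves", "french press coffee", "ski jacket mens"]
kept = relevant_history("ski jacket", history)
```

With these inputs, the unrelated coffee search is excluded from the summary while the two ski-related searches are retained.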
[0043] The retrieval query generation module 166 may then analyze the summary and the list of attributes from the summary generation module 164 and data extraction module 150, respectively. Depending on the implementation, the retrieval query generation module 166 may generate a list of retrieval queries based on the list of attributes and then modify (e.g., remove, add, or change) queries based on the generated summary. In further implementations, the retrieval query generation module 166 performs the opposite, and generates the list of retrieval queries based on the summary and then modifies the queries based on the list of attributes. In still further implementations, the retrieval query generation module 166 may generate the list of retrieval queries based on both outputs in conjunction with one another. Depending on the implementation, the list of retrieval queries may be or include queries to particular attributes, queries associated with particular brands, combinations of other retrieval queries, etc. The personalization module 152 may then provide the list of retrieval queries to the search engine 106.
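One way to picture the first ordering described above (generate from attributes, then modify using the history summary) is the sketch below. It is template-based rather than LLM-based, and the function name, the brand "Acme", and the idea of representing the summary as a list of preference terms are all assumptions made for illustration.

```python
from itertools import product


def generate_retrieval_queries(base_query: str, attributes: list[str],
                               brands: list[str],
                               history_terms: list[str]) -> list[str]:
    """Generate attribute, brand, and combined queries, then personalize them."""
    # Step 1: candidate queries from attributes, brands, and their combinations.
    candidates = [f"{attr} {base_query}" for attr in attributes]
    candidates += [f"{brand} {base_query}" for brand in brands]
    candidates += [f"{brand} {attr} {base_query}"
                   for brand, attr in product(brands, attributes)]
    # Step 2: modify each candidate using terms from the history summary.
    personalized = []
    for query in candidates:
        extra = [term for term in history_terms if term not in query.split()]
        personalized.append(" ".join([query, *extra]))
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(personalized))


queries = generate_retrieval_queries(
    "ski jacket", ["insulated"], ["Acme"], ["mens"]
)
```

The opposite ordering, or joint generation from both inputs, would change only which list seeds the candidates and which list drives the modification step.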
[0044] The search engine 106 may perform a search for content using the retrieval queries and may eliminate any hallucinations in the generated list of queries. In further implementations, the search engine may return output search results to the recommendation module 154, which may eliminate any hallucinations in the generated list of queries based on the output. For example, the search engine and / or the recommendation module 154 may remove any queries that return no results, and / or may otherwise ensure that only correct retrieval queries are considered. The recommendation module 154 may generate one or more recommendations for the user based on the search query and the returned search results. For example, the recommendations may include one or more content items, webpages, brands, etc. as recommendations.
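The removal of queries that return no results can be sketched as follows. The in-memory `CATALOG` and `toy_search` function are hypothetical stand-ins for the search engine 106, used only so the example is self-contained.

```python
from typing import Callable

# Hypothetical catalog standing in for content reachable by the search engine.
CATALOG = ["insulated ski jacket", "waterproof ski gloves"]


def toy_search(query: str) -> list[str]:
    """Return catalog items containing every token of the query."""
    tokens = set(query.lower().split())
    return [item for item in CATALOG if tokens <= set(item.split())]


def filter_queries(queries: list[str],
                   search: Callable[[str], list[str]]) -> dict[str, list[str]]:
    """Keep only queries that return at least one result, with their results."""
    results: dict[str, list[str]] = {}
    for query in queries:
        hits = search(query)
        if hits:  # a query with no hits is treated as a likely hallucination
            results[query] = hits
    return results


kept = filter_queries(["insulated ski jacket", "self-heating ski jacket"], toy_search)
```

Because the second query matches nothing in the catalog, it is dropped before any recommendation is generated from it.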
[0045] In further implementations, the attribute detail module 170 may generate a summary of the recommendation including one or more explanations. For example, the attribute detail module 170 may include one or more indications of which search retrieval queries led to which results, which attributes caused the model(s) to recommend a particular content item, how the content item relates to the initial search query, etc. Depending on the implementation, the attribute detail module 170 may be or include an LLM configured to generate such summaries, as described in more detail below.
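A small sketch of one such indication, namely which retrieval queries led to which results, is given below. The function name and the dictionary-based representation are assumptions; the disclosure leaves the form of the explanation open and contemplates an LLM generating the summary text.

```python
def queries_by_result(results_by_query: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a query-to-results mapping into result-to-queries provenance."""
    provenance: dict[str, list[str]] = {}
    for query, items in results_by_query.items():
        for item in items:
            provenance.setdefault(item, []).append(query)
    return provenance


provenance = queries_by_result({
    "insulated ski jacket": ["jacket A", "jacket B"],
    "Acme ski jacket": ["jacket B"],
})
```

In this example "jacket B" is traced back to both queries, which an explanation could surface as matching both the attribute and the brand.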
[0046] In some implementations and / or scenarios, the computing system 104 (or another computing system not shown in FIG. 1) trains the models of the data extraction module 150, the personalization module 152, and / or the recommendation module 154. In particular, the training module 156 may train the modules using historical data 172 (e.g., from past searches, the search engine 106, etc.) and / or generated data 174 (e.g., as generated by the data extraction module 150, personalization module 152, etc.) as described herein. In some implementations, the historical data 172 is generalized data rather than personalized user data. For example, the training module 156 may use a filtering model to determine that users who broadly prefer X and Y may prefer Z as well to broadly train the models and / or modules on populations rather than individuals. As such, the training module 156 reduces the overall resources used to train the models by refraining from generating individualized models for every user and occasion, but instead using an already trained search engine to provide information and relying on individual analysis of user history instead. In some implementations, the historical data 172 includes data provided by content providers (e.g., via content database 108). In some implementations, training module 156 is included in a computing system other than computing system 104, and computing system 104 only includes or accesses the models and / or modules in question after the model(s) / module(s) is / are trained. In some implementations, training machine learning models may produce byproduct weights, or parameters which may be initialized to random values. The training module 156 may modify the weights as the network is iteratively trained, by using one of several gradient descent algorithms, to reduce loss and to cause the values output by the network to converge to expected (or “learned”) values.
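The iterative weight update described above may be illustrated with a minimal sketch, assuming a single-weight linear model and a mean squared error loss. The model, the loss, the learning rate, and the data are illustrative assumptions and are not taken from the disclosure.

```python
import random
from typing import List


def train_weight(
    xs: List[float],
    ys: List[float],
    learning_rate: float = 0.05,
    steps: int = 200,
    seed: int = 0,
) -> float:
    """Fit y = w * x by plain gradient descent on mean squared error."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)  # weight initialized to a random value
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # step against the gradient to reduce loss
    return w


# Toy data generated by y = 2x, so the learned weight should approach 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = train_weight(xs, ys)
print(round(w, 6))
```

Each step moves the weight against the gradient of the loss, so the output of the model converges toward the expected values regardless of the random starting point.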
[0047] In some implementations, as noted above, the modules and / or models may be or include a generative AI model and may have been trained by computing system 104 or another computing system using supervised or semi-supervised learning techniques, using training data of the appropriate modality (e.g., text data). Such generative AI models may be general-purpose models (e.g., trained on a wide array of publicly available datasets such as web pages, documents, etc., available via the Internet) or may be domain-specific models (e.g., trained or finetuned on custom and / or proprietary datasets, such as documents / data available via one or more intranets). In some implementations, the generative AI models have parameters tuned, via the training process, specifically for high performance in the context of generating text having one or more particular qualities and / or characteristics.
[0048] In some implementations, the computing system 104 accesses a remote server / system that provides generative AI as a service (i.e., with at least a portion of the data extraction module 150, the personalization module 152, and / or the recommendation module 154 residing at a location remote from the computing system 104). In other implementations, the data extraction module 150, the personalization module 152, and / or the recommendation module 154 are local to the computing system 104. Thus, the data extraction module 150, the personalization module 152, and / or the recommendation module 154 may reside at the computing system 104 as shown in FIG. 1, or the computing system 104 may access the data extraction module 150, the personalization module 152, and / or the recommendation module 154 by communicating with another computing system via the network 110. For example, the data extraction module 150, the personalization module 152, and / or the recommendation module 154 may be or include AI models that a remote server makes available to computing systems (including computing system 104) via an application programming interface (API).
[0049] The historical data 172 and / or generated data 174 may generally include any text data used for training purposes. The historical data 172 and / or generated data 174 may include, for example, search data, summary data, scraped data, and / or historical data for such metrics as described herein.
[0050] The operation of the data extraction module 150, the personalization module 152, the recommendation module 154, the training module 156, and their constituent parts, will be discussed in further detail below in connection with various example implementations.
[0051] FIGs. 2A and 2B depict exemplary models that may be used (or parts of which may be used) as part of the data extraction module 150, the personalization module 152, and / or the recommendation module 154, for example. It is understood, however, that these are just some of a number of suitable AI model types that may be used by the computing system 104.
[0009] Turning first to FIG. 2A, an exemplary model 200A uses generative AI techniques. The model 200A may be used as part of the data extraction module 150, the personalization module 152, and / or the recommendation module 154, for example. In particular, a generator model 210 and a discriminator model 220 receive inputs to generate a binary classification 235 and output a sequence of words and / or other metrics as described herein. In particular, the generator model 210 receives an input vector 205A to generate a generated example 215. In some implementations, the input vector 205A is a fixed-length random vector. In some implementations, the input vector 205A may be drawn randomly from a Gaussian distribution. Depending on the implementation, the vector space corresponding to the input vector 205A may include one or more hidden variables (e.g., variables that are not directly observable). In some implementations, the input vector 205A is used to seed the generative process. Using the input vector 205A, the generator model 210 then generates a generated example 215.
[0010] In some implementations, the discriminator model 220 then receives the generated example 215 or a real example 225. The discriminator model 220 may generate a binary classification 235 inferring / indicating whether the received input is model-generated or real. The exemplary model 200A may additionally output an output product and / or use the binary classification 235 in training the generator model 210 and / or discriminator model 220.
[0011] In still further implementations, the exemplary model 200A uses both the generator model 210 and the discriminator model 220 for training and subsequently uses only the generator model 210 for generative modeling as described herein.
[0012] In some implementations, the generator model 210 and the discriminator model 220 are trained according to adversarial techniques (e.g., when the discriminator model 220 correctly generates the binary classification 235, the generator model 210 is updated and, when the discriminator model 220 incorrectly generates the binary classification 235, the discriminator model 220 is updated).
[0013] Depending on the implementation, the generator model 210 and / or the discriminator model 220 may be or include neural networks, such as artificial neural networks (ANN), convolutional neural networks (CNN), or recurrent neural networks (RNN). In further implementations, the model 200A, the generator model 210, and / or the discriminator model 220 may incorporate, include, be, and / or otherwise use techniques including and / or in a manner reminiscent of language model techniques (e.g., an LLM, a bag-of-words model, etc.). Similarly, the model 200A, the generator model 210, and / or the discriminator model 220 may incorporate, include, be, and / or otherwise use a transformer architecture to utilize the appropriate language model techniques, as described with regard to FIG. 2B below.
[0014] FIG. 2B illustrates an exemplary LLM 200B, which receives an input vector 205B similar to input vector 205A and provides an output 260. The LLM 200B may be used as and / or in the data extraction module 150, the personalization module 152, and / or the recommendation module 154, for example. In some implementations, the LLM 200B is initially trained to predict a word and / or event in a sequence of words and / or events. For example, the LLM 200B may be given a word sequence that leads up to “Today is a,” and predict a next word, such as “sunny day”, “Saturday”, “holiday”, etc. Similarly, the LLM 200B may be trained to generate an event in a series of events. For example, the LLM 200B may be given a series of events that leads up to a content item being displayed to a user and predict a user response, such as clicking through the content item to a webpage, ignoring the content item, etc.
[0015] In some implementations, transformers are used to train the LLM 200B (e.g., a generative pre-trained transformer (GPT) model). More specifically, some implementations use a GPT model that includes (i) an encoder that processes the input sequence, and (ii) a decoder that generates the output sequence. The encoder and decoder may both include a multi-head self-attention mechanism that allows the GPT model to differentially weight parts of the input sequence to infer meaning and context (e.g., using metadata in the historical and / or training data), for example.
[0016] The input vector 205B may be a vector representative of relationships between words, sequences, etc. in the input. The LLM 200B may include a self-attention block 252 component to attend to different parts of the input simultaneously or near-simultaneously to capture relationships and / or dependencies between the different parts of the input (e.g., referred to as a multi self-attention block, multi-head attention block, multi-head self-attention block, masked multi self-attention block, masked multi-head attention block, masked multi-head self-attention block, etc.). In particular, the self-attention block 252 relates different positions of a sequence to compute a representation of the sequence. The self-attention block 252 may thus weigh the impact of different words in a sequence when processing the sequence, and the LLM 200B learns to give emphasis to different portions of an input vector 205B.
[0017] The self-attention block 252 may then compute an attention score representing the impact of each word and / or event in the sentence with respect to the other words and / or events in the sentence (e.g., by taking a dot product between different vector sets). The output then proceeds to the normalization layer 254. The normalization layer 254 may normalize the output of the self-attention block 252 (e.g., by applying a softmax function to normalize the scores).
[0018] Similarly, the self-attention block 252 may provide output to a feed-forward network block 256, which performs a non-linear transformation to generate a new representation of the input and / or relationships between words, sequences, etc. In particular, the feed-forward network block 256 may compute a weighted sum of the vectors, using the calculated and normalized attention scores to capture the contextual relationships between words. In some implementations, the normalization layer 254 and / or the self-attention block 252 performs the computation to generate a representation of the relationship between words, etc. After the feed-forward network block 256, an additional normalization layer 258 may normalize the respective output and / or add residual connection(s) to allow the output to move directly to another input. The LLM 200B may therefore learn which parts of an input are important (e.g., remain prevalent through the normalization process). Depending on the implementation, the training of LLM 200B may repeat the process any suitable number of times.
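The dot-product scoring, softmax normalization, and weighted sum described in the preceding paragraphs may be sketched as follows. This is a minimal single-head sketch of the commonly used scaled dot-product formulation; the division by the square root of the vector dimension, the use of the token vectors directly as queries, keys, and values, and the function names are assumptions not stated in the description, and the sketch does not assign each step to a particular block of FIG. 2B.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Normalize raw scores into weights that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> List[Vector]:
    """Return, for each position, a weighted sum of the value vectors."""
    scale = math.sqrt(len(keys[0]))  # scaling factor (assumed, see lead-in)
    outputs: List[Vector] = []
    for q in queries:
        scores = [dot(q, k) / scale for k in keys]  # attention scores
        weights = softmax(scores)  # normalized attention scores
        mixed = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Two toy token vectors used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
print(attended)
```

Each output row is a convex combination of the value vectors, so a position places the most weight on the positions whose keys are most similar to its query.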
[0019] Depending on the implementation, an encoder and / or a decoder may be trained as described above. In further implementations, the encoder is trained in accordance with the above, and a decoder includes an additional self-attention block (not shown) receiving the output of the encoder.
[0052] FIG. 3 depicts an example block diagram 300 for analyzing data associated with a search query (e.g., a search for content) and, based on such, generating retrieval queries and associated recommendations personalized for a user. Depending on the implementation, a computing system (e.g., computing system 104 of FIG. 1) may perform the actions and / or generate the outputs as described herein via a data extraction module 310, personalization module 320, and / or recommendation module 330. In some such implementations, the data extraction module 310, personalization module 320, and / or recommendation module 330 correspond to the data extraction module 150, personalization module 152, and / or recommendation module 154 as described above with regard to FIG. 1.
[0053] In some implementations, the data extraction module 310 includes a content search module 314 and / or a large language model (LLM) 316. The content search module 314 may initially receive a content link 312 (e.g., a URL, a pointer, a portion of the content item, etc.). In some implementations, the content link is received from a search module and / or model responsive to an initial search query (e.g., query 322). In further implementations, the content search module 314 receives an indication of the content link 312 from the user (e.g., the user inputting the content link 312, the user clicking on the content link, the user indicating the content link responsive to a prompt from the computing system 104, etc.).
[0054] The content search module 314 then accesses an information resource associated with the content link 312 to access a content description 315. The content description 315 may be or include a literal description of a content item (e.g., a product description of an item associated with the content link 312, an abstract of a paper associated with the content link 312, a page summary of a webpage associated with the content link 312, etc.). In further implementations, the content description 315 may be or include additional or alternate elements (e.g., a title for a page, metadata associated with an information resource, user reviews associated with a product, etc.). In still further implementations, the content description 315 may include multimodal data, such as text data, visual data (e.g., an image, a video, etc.), audio data, etc.
[0055] The LLM 316 may then analyze the content description 315 to generate one or more content attributes 318 associated with the content description 315. Depending on the implementation, the LLM 316 may analyze the content description 315 using vector embedding techniques. In particular, the LLM 316 may convert elements of the content description 315 into fixed-size vectors representative of tokenized concepts (e.g., the words of the content description 315) and analyze the values of the tokenized concepts in vector space to determine relevant and / or salient words present in the content description. For example, the LLM 316 may analyze the content description to detect the presence of various content attributes 318, such as descriptive words (e.g., related to color, size, shape, purpose, etc.), brand names, product categories, etc. The content attributes 318 may be representative of one or more predictions for content the user is interested in and / or may search for in the future based on a particular content item and / or query.
[0056] In some implementations, the LLM 316 is a multimodal LLM, and may therefore parse additional modal data besides text. For example, the LLM 316 may analyze image data (e.g., product pictures, brand logos, etc.) to determine one or more visual attributes 319 associated with the content description 315. Similarly, the LLM 316 may analyze audio data, video data, and / or any other such multimodal data.
[0057] In some implementations, the personalization module 320 may receive a query 322 (e.g., a search query) from a user at a multi-task transformer model 325. The multi-task transformer model 325 may access, retrieve, and / or otherwise receive a user account history for the user associated with the query 322. In further implementations, the personalization module 320 receives an indication from a user to generate recommendations and analyzes the user account history responsive to the indication instead.
[0058] The multi-task transformer model 325 may then generate a user summary 324 based on the user account history. For example, the multi-task transformer model 325 may analyze one or more searches in the user account history to determine a summary of what the user may be searching for and / or areas of interest for the user. In some such implementations, the multi-task transformer model 325 may generate one or more prompts for the user to confirm the user summary 324, to clarify a point of confusion in the user summary 324, to provide additional information, etc. Depending on the implementation, the multi-task transformer model 325 may be or include an LLM such as the LLM 316 and may analyze past search queries of the user search history associated with the user account similarly. In other implementations, the data extraction module 310 and personalization module 320 are the same module and / or otherwise call the same model (e.g., the multi-task transformer model 325 and / or LLM 316) to generate both the user summary 324 and the content attributes 318.
[0059] In some implementations, the multi-task transformer model 325 analyzes the user summary 324 and the content attributes 318 to generate one or more retrieval queries 326. In particular, the multi-task transformer model 325 may generate a list of retrieval queries based on the content attributes 318 and / or predictions generated from such. The multi-task transformer model 325 may further modify the list of retrieval queries 326 based on the user summary 324 (and / or vice versa). Depending on the implementation, the multi-task transformer model 325 may generate the list of retrieval queries 326 to include a predetermined number of retrieval queries 326, a predetermined maximum number of retrieval queries 326, a predetermined minimum number of retrieval queries 326, retrieval queries 326 with a relevance value at least meeting a predetermined threshold, etc.
[0060] For example, the multi-task transformer model 325 may analyze the content attributes to determine that a user is presently searching and / or predicted to search for a ski jacket. Based on the content attributes 318, the multi-task transformer model 325 may determine that the user is interested in a ski jacket that is sleek, light, warm, cheap, etc. The multi-task transformer model 325 may further determine that the user is interested in designs from companies A, B, and C. However, the user summary may indicate that jackets from company C may be too expensive (e.g., based on past searches regarding company C ski jackets in conjunction with phrases like “discount”, “cheap”, “used”, etc.), and the multi-task transformer model 325 may therefore eliminate or remove retrieval queries associated with company C and / or add search queries associated with discounts or sales in conjunction with company C.
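As a non-limiting sketch, limiting the list of retrieval queries to a predetermined maximum number and to a predetermined relevance threshold may look as follows. The relevance values, the threshold of 0.5, and the maximum of three queries are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple


def select_queries(
    scored: List[Tuple[str, float]],
    max_queries: int = 3,
    min_relevance: float = 0.5,
) -> List[str]:
    """Keep the most relevant queries that meet the threshold, up to a cap."""
    eligible = [(q, s) for q, s in scored if s >= min_relevance]
    eligible.sort(key=lambda pair: pair[1], reverse=True)  # most relevant first
    return [q for q, _ in eligible[:max_queries]]


# Hypothetical candidate queries with assumed relevance values.
candidates = [
    ("warm ski jacket", 0.91),
    ("ski jacket", 0.75),
    ("snow boots", 0.30),
    ("light ski jacket", 0.82),
    ("cheap ski jacket", 0.66),
]
selected = select_queries(candidates)
print(selected)
```

Here the query below the threshold is discarded first, and the remaining queries are then truncated to the three with the highest relevance values.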
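The ski jacket example may be sketched as follows, with simple string templates standing in for the multi-task transformer model 325. The function names, the query templates, and the list of price-sensitive brands (assumed here to have been derived from the user summary) are hypothetical.

```python
from typing import List


def build_queries(product: str, attributes: List[str], brands: List[str]) -> List[str]:
    """Generate one query per content attribute and one per brand."""
    queries = [f"{attribute} {product}" for attribute in attributes]
    queries += [f"{brand} {product}" for brand in brands]
    return queries


def personalize(
    queries: List[str], product: str, price_sensitive_brands: List[str]
) -> List[str]:
    """Replace queries for brands the user finds expensive with discount queries."""
    personalized: List[str] = []
    for query in queries:
        brand = next(
            (b for b in price_sensitive_brands if query.startswith(b + " ")),
            None,
        )
        if brand is None:
            personalized.append(query)  # unaffected by the user summary
        else:
            personalized.append(f"{brand} {product} discount")
    return personalized


base = build_queries(
    "ski jacket",
    ["sleek", "light", "warm", "cheap"],
    ["company A", "company B", "company C"],
)
final = personalize(base, "ski jacket", ["company C"])
print(final)
```

In this sketch the plain company C query is removed and a discount query for company C is added in its place, while the attribute queries and the queries for companies A and B are left unchanged.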
[0061] The personalization module 320 may then transmit the retrieval queries 326 and / or otherwise feed the retrieval queries 326 to a search model 327 to generate and / or retrieve one or more content items 328 associated with the retrieval queries 326. In some implementations, the search model 327 pulls associated content items 328 from a database of content items on the order of 100 items, 1000 items, 10000 items, etc.
[0062] In some implementations, the search model 327 and / or multi-task transformer model 325 are or include a pre-trained model that does not require additional training (e.g., based on the individual user data). As such, the additional training that other models may require may be unnecessary and may therefore reduce the overall computing resources utilized and required.
[0063] The recommendation module 330 may then analyze the query 322, user summary 324, content items 328, and / or content attributes 318 via a recommendation model 332 to generate a recommendation 334. Depending on the implementation, the recommendation 334 may include one or more example content items 328, predicted / recommended future searches, and / or summaries of such. For example, the recommendation 334 may include one or more ski jackets for sale along with summaries of why the recommendation 334 includes the one or more ski jackets (e.g., related content attributes 318, how the computing system 104 interpreted the query 322, past searches in the user summary 324, etc.).
[0064] In some implementations, the recommendation model 332 and / or the multi-task transformer model 325 determines whether to ask the user additional questions. For example, the recommendation model 332 and / or the multi-task transformer model 325 may determine that the user search query is too broad to determine recommendations with enough certainty (e.g., meeting a predetermined confidence metric / score), and may generate prompts for the user to better direct the search and / or recommendation. Depending on the implementation, the prompts for the user may be text prompts, image prompts, audio prompts, a combination thereof, etc. For example, the recommendation model 332 and / or the multi-task transformer model 325 may generate a text prompt requesting more details (e.g., “Where are you planning to use the jacket?”), an image prompt (e.g., two jackets with a prompt to select a preferred jacket), an audio version of the text prompt, etc.
[0065] In further implementations, the recommendation model 332 may generate one or more future searches that the recommendation model 332 determines the user may search next. For example, if the user search query 322 is “ski jacket”, the recommendation model 332 may additionally or alternatively generate one or more related potential future search recommendations (e.g., ski resort, gifts, travel packages, etc.). In some implementations, the recommendation model 332 determines such based on a confidence score and / or a similar prediction of whether a user will likely make such a query (e.g., based on data from other users (e.g., users that made similar queries)). For example, the recommendation model 332 may determine one or more complementary items for a user (e.g., shirts or shoes to match pants, destination attractions for a ticket to Hawaii, etc.) by determining a confidence score and / or similar prediction (e.g., based on general user trend data, past user searches, etc.).
[0066] In some implementations, the computing system 104 includes multiple recommendation models 332 and / or multi-task transformer models 325 trained for different categories of product, and the computing system 104 may direct the search to a particular model or models based on the responses to the prompts. In further implementations, the computing system 104 includes a single recommendation model 332 and / or multi-task transformer model 325, and the recommendation model 332 and / or the multi-task transformer model 325 makes the determination as described herein.
[0067] In some implementations, the recommendation model 332 may generate a summary using one or more semantic determinations (e.g., determining words similar to the search query and / or response) for generating a summary more easily understood by a user. Similarly, the recommendation model 332 may perform a behavior analysis using user responses to determine how to respond and / or summarize information for the user.
[0068] In some implementations, the recommendation model 332 may rank the received content items prior to generating the recommendation. In some such implementations, the recommendation model 332 generates the recommendation to include a predetermined number of content items with the highest ranking(s). For example, the recommendation model 332 may rank the content items based on relevance to the personalized retrieval queries 326, relevance to responses by the user, relevance to the initial search query, etc. In further implementations, the retrieval queries 326 may include one or more categories of items for content, and the recommendation model 332 may rank the content items based on and / or within a category. As such, the recommendation model 332 may reduce the number of relevant possibilities, improving the overall performance and enabling the overall agent models to properly handle the number of content items in a timely manner. For example, the content items may be on the order of 1000, 10000, 100000, etc., which may be too many for the models to analyze in real time without reducing the sets.
[0069] It will be understood that, although FIG. 3 depicts example scenarios with particular components, a system may include additional, alternate, or fewer components and / or functionalities than those described above.
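Ranking content items within a category and keeping a predetermined number of the highest-ranked items may be sketched as follows. The item fields, the relevance values, and the cutoff of two items per category are hypothetical; in practice the relevance value would be produced by the recommendation model 332 rather than supplied as data.

```python
from collections import defaultdict
from typing import Dict, List


def top_items_per_category(items: List[dict], k: int = 2) -> Dict[str, List[str]]:
    """Group items by category and keep the k most relevant names in each."""
    by_category: Dict[str, List[dict]] = defaultdict(list)
    for item in items:
        by_category[item["category"]].append(item)
    return {
        category: [
            entry["name"]
            for entry in sorted(group, key=lambda e: e["relevance"], reverse=True)[:k]
        ]
        for category, group in by_category.items()
    }


# Hypothetical content items with assumed relevance values.
items = [
    {"name": "Jacket A", "category": "jackets", "relevance": 0.9},
    {"name": "Jacket B", "category": "jackets", "relevance": 0.4},
    {"name": "Jacket C", "category": "jackets", "relevance": 0.7},
    {"name": "Goggles A", "category": "goggles", "relevance": 0.6},
]
top = top_items_per_category(items, k=2)
print(top)
```

Truncating each category to a fixed number of items reduces the set that later stages need to analyze, which is the reduction the paragraph above relies on for timely handling of large result sets.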
[0070] FIG. 4 depicts an example user interface (UI) 400 for providing recommendations associated with a search query (e.g., a search for content) to a user. Depending on the implementation, a computing system (e.g., computing system 104 of FIG. 1) may perform the actions and / or generate the outputs as described herein via a data extraction module (e.g., data extraction module 150 and / or 310), personalization module (e.g., personalization module 152 and / or 320), and / or recommendation module (e.g., recommendation module 154 and / or 330), before causing a client device (e.g., client device 102 of FIG. 1) to display the UI 400.
[0071] The client device 102 may display, to a user, a search query 405. In some implementations, the search query 405 may be input by the user into the input field 450, and the client device 102 may display the search query 405 exactly as the user inputs. In further implementations, the client device 102 may modify the search query 405 to better reflect the perceived goal of the user (e.g., changing the search query 405 of “ski jacket” into “Looking for a ski jacket”). In some such implementations, the client device 102 may modify the search query 405 over time as the user clarifies a purpose or goal. In still further implementations, the client device 102 and / or computing system 104 may analyze an input from the user and generate the search query 405 based on the input. For example, a user may input a brand name, a series of attributes, a description, etc. and the client device 102 and / or computing system 104 may determine that the user is looking for a particular product, service, document, etc.
[0072] The client device 102 may display, via the UI 400, one or more recommendations 430 for the user to review. In some implementations, the client device 102 automatically displays the recommendations 430 responsive to receiving and / or determining the search query 405. In some such implementations, the client device 102 generates the recommendations as described herein (e.g., with regard to FIGs. 3 and / or 5). In further such implementations, the client device 102 may display additional prompts 410 to prompt the user for additional information. For example, the client device 102 may generate additional prompts 410 related to desirable attributes, preferred price range, preferred brands, etc. In such implementations, the client device 102 may update the recommendations 430 responsive to receiving a response 420 from the user.
[0073] In further implementations, the client device 102 may generate and display the prompts 410 automatically upon receiving the search query 405 but before generating the recommendations 430. In some such implementations, the client device 102 generates the additional prompts 410 responsive to determining that additional information is needed to provide a sufficiently relevant recommendation. In other such implementations, the client device 102 always generates at least a predetermined number of additional prompts 410 to receive a response 420 from the user before generating the recommendations 430.
[0074] In some implementations, the user may provide a response 420 to the additional prompts 410. Depending on the implementation, the response 420 may include additional attributes, additional questions, an indication of whether the recommendations 430 are relevant to the user, etc. In some implementations, the user may provide the response 420 before and / or without receiving a prompt 410. As such, the client device 102 may generate the additional prompts 410 responsive to receiving the response 420. Similarly, the client device 102 may modify the recommendations 430 based on what the user includes in the response 420.
[0075] Depending on the implementation, the recommendations 430 may include one or more recommendations for search results based on the search query 405, additional prompts 410, and / or responses 420. In further implementations, the recommendations 430 may additionally include a summary of why the recommendations 430 are provided. Depending on the implementation, the summary may include relevant attributes, related portions of the search query 405 and / or response 420, related brands, user search history results (e.g., as described with regard to FIGs. 3 and / or 5), etc. In further implementations, the recommendations 430 may include one or more recommendations for additional searches to perform related to the search query 405 and / or responses 420.
[0076] It will be understood that, although FIG. 4 depicts example scenarios with particular components, a system may include additional, alternate, or fewer components and / or functionalities than those described above.
[0077] FIG. 5 is a flow diagram of an example method 500 for determining and generating retrieval queries for a user. The method 500 may be implemented using instructions stored on one or more non-transitory, computer-readable media (e.g., memory 144) that are executed by one or more processors in one or more computing devices. For example, the method 500 may be implemented by the processor 142 of the computing system 104 in FIG. 1, when executing instructions of the data extraction module 150, the personalization module 152, the recommendation module 154, and / or one or more other modules / components. It will be understood that additional, fewer, and / or alternate components may be used to implement the example method 500, and / or that the method 500 may include more or fewer blocks than shown (and / or in a different order than shown).
[0078] At block 502, the computing system 104 may receive a search query from a user associated with an account profile. In some implementations, the account profile includes a user history. In some implementations, the search query may be input by the user into a field (e.g., a search bar, a chat box window, etc.) directly. In further implementations, the client device 102 and / or computing system 104 may modify the search query as the user clarifies a purpose or goal. In still further implementations, the client device 102 and / or computing system 104 may analyze an input from the user and generate the search query based on the input. For example, a user may input a brand name, a series of attributes, a description, etc. and the client device 102 and / or computing system 104 may determine that the user is looking for a particular product, service, document, etc.
[0079] In some implementations, the search query may be a broad and / or vague search query. In such implementations, because the search query is broad and / or vague, a conventional model may generate hallucinations and / or unrelated or inaccurate results. For example, an initial search query of “ski jacket” as analyzed by a conventional model may generate incorrect results (e.g., a child ski jacket when the user is an adult) and / or hallucinations for products that do not exist. As such, the instant techniques may generate additional data based on the search query and / or other factors to improve the overall input into a model and generate more accurate and existent options.
[0080] At block 504, the computing system 104 may generate, based on the search query, a context attribute summary. In some implementations, the context attribute summary is indicative of a context associated with the search query (e.g., as received at block 502). In further such implementations, the context attribute summary may be predictive, and may include one or more predictions of future searches by and / or content items of interest for the user based on the search query. In some implementations, the computing system 104, as part of generating the context attribute summary, receives a plurality of search results associated with the search query (e.g., from a search model as detailed below with regard to block 510). The computing system 104 may then scrape (e.g., search through and gather or collate data associated with) one or more descriptions associated with the plurality of search results. The computing system 104 may then analyze the one or more descriptions to generate the context attribute summary, as described in more detail above.
[0081] In some implementations, the computing system 104 performs the scraping using a trained large language model (LLM). Depending on the implementation, the LLM is trained as described above (e.g., with regard to FIGs. 1-2B). In some implementations, the computing system 104, as part of the scraping, retrieves unstructured data associated with a description of at least one of the plurality of search results. The computing system 104 may then analyze the unstructured data to detect one or more salient words to generate the context attribute summary. Depending on the implementation, salient words may be or include keywords associated with the search query, words related to descriptive elements of an item (e.g., color, size, purpose, etc.), words that appear with a relatively higher frequency (e.g., words emphasized for SEO, words included in both an item / page / resource title and a description associated with such, words that are repeated multiple times within a description, etc.), and / or any other such measure of salience.
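One of the salience measures mentioned above (relative word frequency, with extra weight for words that appear in both a title and its description) can be illustrated with a minimal sketch. The names `STOPWORDS`, `tokenize`, `salient_words`, and `title_boost` are hypothetical and do not come from the disclosure, and this simple counting heuristic stands in for the trained LLM that the described implementations may use.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption); a real system would likely use a larger one.
STOPWORDS = {"a", "an", "and", "the", "for", "with", "of", "in", "to", "is", "this"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def salient_words(title, description, top_k=3, title_boost=2):
    """Rank description words by frequency, boosting words that also appear in the title."""
    title_tokens = set(tokenize(title)) - STOPWORDS
    counts = Counter(t for t in tokenize(description) if t not in STOPWORDS)
    scores = {
        word: count * (title_boost if word in title_tokens else 1)
        for word, count in counts.items()
    }
    # Sort by descending score, then alphabetically so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


if __name__ == "__main__":
    print(salient_words(
        "Waterproof Ski Jacket",
        "This waterproof jacket is an insulated ski jacket for adults. "
        "The jacket has a waterproof shell.",
    ))
```

The boost factor and the cutoff `top_k` are arbitrary illustrative choices; any other measure of salience could be substituted.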
[0082] In further implementations, the computing system 104 additionally or alternatively, as part of the scraping, retrieves image data associated with at least one of the plurality of search results (e.g., of an item associated with the plurality of search results). The computing system 104 may then analyze, using an image analysis model, the image data to generate one or more visual details associated with the image data. The computing system 104 may then generate the context attribute summary based on the one or more visual details.
[0083] In still further implementations, the computing system 104 additionally or alternatively, as part of the scraping, retrieves review data associated with at least one of the plurality of search results (e.g., one or more user comments regarding a review of an item associated with the plurality of search results). The computing system 104 then analyzes the review data to generate the context attribute summary. For example, the computing system 104 may determine that one or more reviews include additional descriptions of an item and may use the data from the reviews to further generate attributes associated with the item for the context attribute summary.
[0084] In some implementations, the context attribute summary includes one or more attributes associated with the search query (e.g., with one or more items associated with one or more search results associated with the search query). For example, as described in more detail above, the context attribute summary may include one or more descriptive attributes associated with an item (e.g., color, purpose, size, etc.). Similarly, the context attribute summary may include one or more brands, descriptions of brands, broad categories for items, etc. In further such implementations, the computing system 104 generates the retrieval queries to include at least some of the attributes, brands, etc.
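As a rough illustration of retrieval queries that include attributes and brands from the context attribute summary, the sketch below simply concatenates each brand and attribute pair with the original query. The function `combine_into_queries`, its parameters, and the placeholder brand names are hypothetical; the disclosure generates the queries with a trained language model rather than with this combinatorial shortcut.

```python
from itertools import product


def combine_into_queries(base_query, attributes, brands, limit=5):
    """Build candidate retrieval queries from every (brand, attribute) pair.

    This is an assumed, simplified stand-in for the trained language model
    described in the text.
    """
    queries = [
        f"{brand} {attribute} {base_query}"
        for brand, attribute in product(brands, attributes)
    ]
    # Cap the number of queries so the downstream search stays small.
    return queries[:limit]


if __name__ == "__main__":
    for query in combine_into_queries(
        "ski jacket", ["waterproof", "insulated"], ["BrandA", "BrandB"]
    ):
        print(query)
```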
[0085] At block 506, the computing system 104 may generate, based on the user history, a summarized user browsing history. In some implementations, the summarized user browsing history is indicative of one or more past searches performed by the user. Depending on the implementation, the summarized user browsing history includes one or more indications of similar searches, and may be used by the computing system 104 to eliminate past searches from recommendations (e.g., determining that previous searches were changed to move away from earlier searches), inform future recommendations (e.g., determining that a trend in previous searches indicates a line of inquiry the user is pursuing), etc. For example, an LLM may perform analysis on vector embeddings associated with past search queries to determine whether search queries have moved away from a topic (e.g., the analysis indicates increased distance in vector space), moved toward a topic (e.g., the analysis indicates decreased distance in vector space), oscillated around a topic (e.g., the analysis indicates a mix of increases and decreases in distance in vector space, indicative of user uncertainty or unfamiliarity regarding a topic), etc.
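The distance-trend analysis described above can be sketched as follows, under the assumption that each past query and the topic are already available as embedding vectors. The functions `cosine_distance` and `classify_trend`, the label strings, and the tolerance `tol` are hypothetical; the disclosure attributes this analysis to an LLM, whereas this sketch uses a plain comparison of consecutive cosine distances.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def classify_trend(query_embeddings, topic_embedding, tol=1e-9):
    """Classify how a chronological sequence of queries moves relative to a topic."""
    distances = [cosine_distance(q, topic_embedding) for q in query_embeddings]
    # Change in distance between each pair of consecutive queries.
    deltas = [later - earlier for earlier, later in zip(distances, distances[1:])]
    increases = sum(1 for d in deltas if d > tol)
    decreases = sum(1 for d in deltas if d < -tol)
    if increases and decreases:
        return "oscillating"    # mix of increases and decreases
    if increases:
        return "moving_away"    # distance only grows
    if decreases:
        return "moving_toward"  # distance only shrinks
    return "stable"             # no measurable change, or fewer than two queries


if __name__ == "__main__":
    topic = [1.0, 0.0]
    history = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
    print(classify_trend(history, topic))
```

A result of "oscillating" would correspond to the case the text associates with user uncertainty or unfamiliarity regarding a topic.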
[0086] At block 508, the computing system 104 may generate, using the context attribute summary and the summarized user browsing history, a plurality of personalized retrieval queries. In some implementations, the computing system 104 generates the plurality of personalized retrieval queries using a trained language model with the context attribute summary and the summarized user browsing history as inputs. Depending on the implementation, the trained language model may be a multi-task transformer model (e.g., a model that leverages the transformer architecture to perform multiple tasks simultaneously).
[0087] At block 510, the computing system 104 may transmit the plurality of personalized retrieval queries to a search model to generate one or more recommendations associated with the search query for the user. Depending on the implementation, the search model may be a model trained and / or configured to search through one or more information resources (e.g., from one or more third party servers, publisher databases, content databases, etc.). For example, the search model may be or include one or more algorithms to control crawlers or other bots, page ranking, transformers, etc. to guide a search using the generated retrieval queries.
[0088] In some implementations, as part of generating the one or more recommendations, the computing system 104 receives, from the search model, search results associated with the plurality of retrieval queries and generates, using a trained LLM (e.g., the same LLM as described with regard to block(s) 504 and / or 506, a different LLM, a copy of the same LLM, etc.), the one or more recommendations based on the received search results. In some implementations, the one or more recommendations include one or more content items associated with the search results, one or more recommended further searches, a summary for each of the one or more recommendations, and / or any other such element as described herein.
[0089] For example, the summary may include one or more detailed attributes determined to be relevant to the search query received from the user (e.g., as described above with regard to FIG. 3), an explanation of why an item or additional query is recommended (e.g., indicating that a user seems to be searching for a concept or item, and indicating that an additional search may be useful), etc. In some such implementations, the explanation may include an indication of which attributes are predicted to be desirable to a user and / or why.
[0090] In further implementations, the computing system 104 may additionally generate one or more prompts for additional information and / or may cause the prompt(s) to be displayed to the user. The user may input the initial search query responsive to the prompt, and / or may provide additional information responsive to the prompt. The computing system 104 may then generate and / or update the context attribute summary, retrieval queries, and / or one or more recommendations based on the additional information. For example, the computing system 104 may prompt a user to indicate whether the user is interested in professional or hobby-focused aspects of a field for which the user has performed one or more searches.
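The data flow through blocks 502-510 of the method 500 can be summarized in a short sketch. Everything named here (`AccountProfile`, `run_method_500`, and the four callables passed in) is a hypothetical placeholder: the models themselves are supplied by the caller, and the sketch only shows the order in which their outputs feed one another.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AccountProfile:
    """Hypothetical account profile holding the user's past searches."""
    user_id: str
    user_history: List[str]


def run_method_500(
    search_query: str,
    profile: AccountProfile,
    summarize_context: Callable[[str], str],
    summarize_history: Callable[[List[str]], str],
    generate_queries: Callable[[str, str], List[str]],
    search_model: Callable[[List[str]], List[str]],
) -> List[str]:
    """Chain the blocks of the method; each callable stands in for a model."""
    # Block 502: the search query and account profile arrive as arguments.
    # Block 504: context attribute summary derived from the search query.
    context_summary = summarize_context(search_query)
    # Block 506: summarized user browsing history derived from the user history.
    history_summary = summarize_history(profile.user_history)
    # Block 508: personalized retrieval queries from both summaries.
    retrieval_queries = generate_queries(context_summary, history_summary)
    # Block 510: the search model turns the queries into recommendations.
    return search_model(retrieval_queries)


if __name__ == "__main__":
    profile = AccountProfile("u1", ["ski boots", "ski goggles"])
    print(run_method_500(
        "ski jacket",
        profile,
        summarize_context=lambda q: f"context:{q}",
        summarize_history=lambda h: "|".join(h),
        generate_queries=lambda c, h: [f"{c} {h}"],
        search_model=lambda qs: [f"result for {q}" for q in qs],
    ))
```

Passing the models in as callables keeps the sketch independent of any particular LLM or search backend, consistent with the statement that the blocks may be implemented by additional, fewer, or alternate components.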
[0091] Artificial intelligence (AI) is a segment of computer science that focuses on the creation of models that can perform tasks with little to no human intervention. Artificial intelligence systems can utilize, for example, machine learning and computer vision. Machine learning, and its subsets, such as deep learning, focus on developing models that can infer outputs from data. The outputs can include, for example, predictions and / or classifications. Computer vision focuses on analyzing and interpreting images and videos. Artificial intelligence systems can include generative models that generate new content in response to input prompts and / or based on other information.
[0092] Example machine-learned models include neural networks or other multi-layer non-linear models. Example neural networks include feed forward neural networks, deep neural networks, recurrent neural networks, and convolutional neural networks. Some example machine-learned models can leverage an attention mechanism such as self-attention. For example, some machine-learned models can include multi-headed self-attention models (e.g., transformer models).
[0093] The model(s) can be trained using various training or learning techniques. The training can implement supervised learning, unsupervised learning, reinforcement learning, etc. The training can use techniques such as, for example, backwards propagation of errors. For example, a loss function can be backpropagated through the model(s) to update one or more parameters of the model(s) (e.g., based on a gradient of the loss function). Various loss functions can be used such as mean squared error, likelihood loss, cross entropy loss, hinge loss, and / or various other loss functions. Gradient descent techniques can be used to iteratively update the parameters over a number of training iterations. A number of generalization techniques (e.g., weight decays, dropouts) can be used to improve the generalization capability of the models being trained.
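As a minimal sketch of the gradient descent technique named above, the example below fits a one-variable linear model by iteratively updating its two parameters along the negative gradient of a mean squared error loss. The function `train_linear`, the learning rate, the step count, and the toy data are illustrative assumptions and are not taken from the disclosure.

```python
def train_linear(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step each parameter against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Toy data generated from y = 2x + 1.
    w, b = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(round(w, 4), round(b, 4))
```

In a neural network the same update rule applies, with the per-parameter gradients obtained by backpropagation instead of the closed-form expressions used here.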
[0094] The model(s) can be pre-trained before domain-specific alignment. For instance, a model can be pre-trained over a general corpus of training data and fine-tuned on a more targeted corpus of training data. A model can be aligned using prompts that are designed to elicit domain-specific outputs. Prompts can be designed to include learned prompt values (e.g., soft prompts). The trained model(s) may be validated prior to their use using input data other than the training data and may be further updated or refined during their use based on additional feedback / inputs.
[0095] In some implementations, the computing system 104 may use one or more of the machine learning models noted above to perform any one or more of the operations discussed herein in connection with machine learning.
[0096] The following list of examples reflects a variety of the embodiments explicitly contemplated by the present disclosure:
[0097] Example 1. A computer-implemented method for generating retrieval queries for a user, the computer-implemented method comprising: receiving, by one or more processors, a search query from a user associated with an account profile, the account profile including a user history; generating, by the one or more processors and based on the search query, a context attribute summary indicative of one or more content items of interest for the user; generating, by the one or more processors and based on the user history, a summarized user browsing history indicative of one or more past searches performed by the user; generating, by the one or more processors and using the context attribute summary and the summarized user browsing history as inputs to a trained language model, a plurality of retrieval queries; and transmitting, by the one or more processors, the plurality of retrieval queries to a search model to generate one or more recommendations associated with the search query for the user.
[0098] Example 2. The computer-implemented method of example 1, wherein the generating of the context attribute summary includes: receiving, by the one or more processors, a plurality of search results associated with the search query; scraping, by the one or more processors, one or more descriptions associated with the plurality of search results; and analyzing, by the one or more processors, the one or more descriptions to generate the context attribute summary.
[0099] Example 3. The computer-implemented method of example 2, wherein the scraping is performed by a trained large language model.
[0100] Example 4. The computer-implemented method of either one of examples 2 or 3, wherein the scraping includes: retrieving, by the one or more processors, unstructured data associated with a description of at least one of the plurality of search results; and analyzing, by the one or more processors, the unstructured data to detect one or more salient words to generate the context attribute summary.
[0101] Example 5. The computer-implemented method of any one of examples 2-4, wherein the scraping includes: retrieving, by the one or more processors, image data associated with at least one of the plurality of search results; analyzing, by the one or more processors and using an image analysis model, the image data to generate one or more visual details associated with the image data; and generating, by the one or more processors, the context attribute summary based on the one or more visual details.
[0102] Example 6. The computer-implemented method of any one of examples 2-5, wherein the scraping includes: retrieving, by the one or more processors, review data associated with at least one of the plurality of search results; and analyzing, by the one or more processors, the review data to generate the context attribute summary.
[0103] Example 7. The computer-implemented method of any one of examples 1-6, wherein the trained language model is a multi-task transformer model.
[0104] Example 8. The computer-implemented method of any one of examples 1-6, wherein the context attribute summary includes at least one or more attributes associated with the search query.
[0105] Example 9. The computer-implemented method of example 8, wherein the context attribute summary additionally includes at least one or more brands associated with the search query.
[0106] Example 10. The computer-implemented method of example 9, wherein the plurality of retrieval queries includes at least a combination of the at least one or more attributes and the at least one or more brands.
[0107] Example 11. The computer-implemented method of example 1, wherein generating the one or more recommendations includes: receiving, by the one or more processors and from the search model, search results associated with the plurality of retrieval queries; and generating, by the one or more processors and using a trained large language model, the one or more recommendations based on the received search results.
[0108] Example 12. The computer-implemented method of example 11, wherein generating the one or more recommendations includes: generating, by the one or more processors, a summary for each of the one or more recommendations.
[0109] Example 13. The computer-implemented method of example 12, wherein the summary includes one or more detailed attributes associated with a corresponding recommendation of the one or more recommendations.
[0110] Example 14. The computer-implemented method of either one of examples 12 or 13, wherein the summary includes an explanation for a corresponding recommendation of the one or more recommendations, the explanation including an indication of which attributes of the corresponding recommendation are predicted to be desirable to the user.
[0111] Example 15. The computer-implemented method of any one of the preceding examples, further comprising: generating, by the one or more processors, one or more prompts for additional information to display to the user; and updating, by the one or more processors, the context attribute summary based on the additional information.
[0112] Example 16. An apparatus, functioning as a server device, including: a transceiver; and one or more processors configured to perform a method according to any one of the preceding examples.
[0113] Although the foregoing text sets forth a detailed description of numerous different aspects and implementations of the invention, it should be understood that the scope of the patent is defined by the words of the claims set forth at the end of this patent. The detailed description is to be construed as exemplary only.
[0114] The following additional considerations apply to the foregoing discussion. Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter of the present disclosure.
[0115] Unless specifically stated otherwise, discussions in the present disclosure using words such as “processing,” “computing,” “calculating,” “determining,” “presenting,” “displaying,” or the like may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.
[0116] As used in the present disclosure, any reference to “one implementation” or “an implementation” means that a particular element, feature, structure, or characteristic described in connection with the implementation is included in at least one implementation. The appearances of the phrase “in one implementation” in various places in the specification are not necessarily all referring to the same implementation.
[0117] As used in the present disclosure, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present), and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
[0118] Unless otherwise apparent from the context of use, reference in the present disclosure to a same set of “one or more processors” (or a same “plurality of processors,” etc.) performing multiple operations can encompass implementations in which performance of the operations is divided among the processor(s) in any suitable way. For example, “generating, by one or more processors, X; and generating, by the one or more processors, Y” can encompass: (1) implementations in which a first subset of the processors (e.g., in a first computing device) generates X and an entirely distinct, second subset of the processors (e.g., in a different, second computing device) independently generates Y; (2) implementations in which one or more or all of the processor(s) (e.g., one or multiple processors in the same device, or multiple processors distributed among multiple devices) contribute to the generation of X and / or Y; and (3) other variations.
[0119] Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs through the principles described herein. Thus, while particular implementations and applications have been illustrated and described, it is to be understood that the disclosed implementations are not limited to the precise construction and components disclosed in the present disclosure. Various modifications, changes, and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed in the present disclosure without departing from the spirit and scope defined in the appended claims.
Claims
What is claimed is:
1. A computer-implemented method for generating retrieval queries for a user, the computer-implemented method comprising: receiving, by one or more processors, a search query from a user associated with an account profile, the account profile including a user history; generating, by the one or more processors and based on the search query, a context attribute summary indicative of at least one of (i) one or more content items of interest for the user or (ii) predictions for future searches by the user; generating, by the one or more processors and based on the user history, a summarized user browsing history indicative of one or more past searches performed by the user; generating, by the one or more processors and using the context attribute summary and the summarized user browsing history as inputs to a trained language model, a plurality of retrieval queries; and transmitting, by the one or more processors, the plurality of retrieval queries to a search model to generate one or more recommendations associated with the search query for the user.
2. The computer-implemented method of claim 1, wherein the generating of the context attribute summary includes: receiving, by the one or more processors, a plurality of search results associated with the search query; scraping, by the one or more processors, one or more descriptions associated with the plurality of search results; and analyzing, by the one or more processors, the one or more descriptions to generate the context attribute summary.
3. The computer-implemented method of claim 2, wherein the scraping is performed by a trained large language model.
4. The computer-implemented method of either one of claims 2 or 3, wherein the scraping includes: retrieving, by the one or more processors, unstructured data associated with a description of at least one of the plurality of search results; and analyzing, by the one or more processors, the unstructured data to detect one or more salient words to generate the context attribute summary.
5. The computer-implemented method of any one of claims 2-4, wherein the scraping includes: retrieving, by the one or more processors, image data associated with at least one of the plurality of search results; analyzing, by the one or more processors and using an image analysis model, the image data to generate one or more visual details associated with the image data; and generating, by the one or more processors, the context attribute summary based on the one or more visual details.
6. The computer-implemented method of any one of claims 2-5, wherein the scraping includes: retrieving, by the one or more processors, review data associated with at least one of the plurality of search results; and analyzing, by the one or more processors, the review data to generate the context attribute summary.
7. The computer-implemented method of any one of claims 1-6, wherein the trained language model is a multi-task transformer model.
8. The computer-implemented method of any one of claims 1-6, wherein the context attribute summary includes at least one or more attributes associated with the search query.
9. The computer-implemented method of claim 8, wherein the context attribute summary additionally includes at least one or more brands associated with the search query.
10. The computer-implemented method of claim 9, wherein the plurality of retrieval queries includes at least a combination of the at least one or more attributes and the at least one or more brands.
11. The computer-implemented method of claim 1, wherein generating the one or more recommendations includes: receiving, by the one or more processors and from the search model, search results associated with the plurality of retrieval queries; and generating, by the one or more processors and using a trained large language model, the one or more recommendations based on the received search results.
12. The computer-implemented method of claim 11, wherein generating the one or more recommendations includes: generating, by the one or more processors, a summary for each of the one or more recommendations.
13. The computer-implemented method of claim 12, wherein the summary includes one or more detailed attributes associated with a corresponding recommendation of the one or more recommendations.
14. The computer-implemented method of either one of claims 12 or 13, wherein the summary includes an explanation for a corresponding recommendation of the one or more recommendations, the explanation including an indication of which attributes of the corresponding recommendation are predicted to be desirable to the user.
15. The computer-implemented method of any one of the preceding claims, further comprising: generating, by the one or more processors, one or more prompts for additional information to display to the user; and updating, by the one or more processors, the context attribute summary based on the additional information.
16. An apparatus, functioning as a server device, including: a transceiver; and one or more processors configured to perform a method according to any one of the preceding claims.