Persona retrieval method, apparatus, device, storage medium and program product
By using a large language model for persona retrieval and employing key tags and features to filter candidate personas, the problem of low efficiency in manual retrieval is solved, achieving efficient and accurate persona retrieval.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING QIYI CENTURY SCI & TECH CO LTD
- Filing Date
- 2026-02-26
- Publication Date
- 2026-06-30
Abstract
Description
Technical Field
[0001] This application relates to the field of computer application technology, and in particular to a persona retrieval method, apparatus, device, storage medium, and program product.
Background Technology
[0002] In film, television, gaming, and literary creation, character design is one of the core elements of a work, directly impacting its appeal and market acceptance. When conceiving new characters, creative teams (such as screenwriters, directors, and producers) often refer to classic characters from existing works for inspiration, market positioning, or to ensure the character's uniqueness. For example, a screenwriter might want to create a "futuristic anti-hero with Shakespearean tragic elements," or a casting director might need to find an actor whose temperament matches the "cold exterior, warm interior detective" character in the script.
[0003] Currently, achieving the above goals mainly relies on manual retrieval and analysis, that is, professional personnel such as script analysts and planners manually read a large number of scripts, novels and character introductions, and compare and screen them based on personal experience and memory.
[0004] This manual retrieval method requires a significant investment of manpower and time, making it difficult to meet the rapid retrieval needs of a massive script library. This is especially true for large film and television platforms, where the operating costs are high and the retrieval efficiency is low. Moreover, the retrieval results are greatly influenced by individual understanding, experience, and preferences. Different people may have different judgments on the same character, leading to unstable and inconsistent retrieval results and poor retrieval accuracy.
Summary of the Invention
[0005] The purpose of this application is to provide a persona retrieval method, apparatus, device, storage medium, and program product that can reduce operating costs and improve the efficiency and accuracy of persona retrieval. The specific technical solution is as follows:
Firstly, a persona retrieval method is provided, including:
receiving a search request for a target persona, the search request carrying text description information of the target persona;
using a first large language model to perform semantic understanding on the text description information of the target persona, and obtaining the key tags and persona features of the target persona output by the first large language model;
selecting candidate personas from the known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database;
determining a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database;
using a second large language model to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and obtaining the retrieval results output by the second large language model, the retrieval results including the reasons for the similarity between the reference persona and the target persona.
[0006] Optionally, determining a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database includes: determining the similarity between the target persona and each candidate persona based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database; and determining a reference persona from the candidate personas based on the similarity.
[0007] Optionally, before the similarity between the target persona and each candidate persona is determined, the method further includes: converting the persona features of the target persona into a query vector; and converting the feature information corresponding to each candidate persona into a feature vector. In this case, determining the similarity between the target persona and each candidate persona includes: determining the similarity based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0008] Optionally, before the similarity between the target persona and each candidate persona is determined, the method further includes: converting the persona features of the target persona into a query vector, where the feature information corresponding to each candidate persona in the retrieval database already includes a feature vector. In this case, determining the similarity between the target persona and each candidate persona includes: determining the similarity based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0009] Optionally, based on the similarity, a reference persona is determined among the candidate personas, including: The top N candidate personas with the highest similarity to the target persona are determined as reference personas, where N is an integer greater than or equal to 1; Alternatively, candidate personas with a similarity greater than or equal to the similarity threshold with the target persona can be identified as reference personas.
[0010] Optionally, the step of filtering candidate personas from known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database includes: For each known persona, if the tag information corresponding to the known persona in the pre-constructed retrieval database includes at least M key tags of the target persona, then the known persona is determined as a candidate persona, where M is an integer greater than or equal to 1.
[0011] Optionally, the retrieval database is constructed through the following steps: determining the important personas in the script library; extracting the text description information corresponding to each important persona; determining, based on the text description information corresponding to each important persona, the tag information and feature information corresponding to that persona; and treating each important persona as a known persona and storing its text description information, tag information, and feature information in the retrieval database.
[0012] Optionally, the retrieval results may also include at least one of the following: the ranking of the reference persona; the name of the reference persona; the similarity between the reference persona and the target persona.
[0013] Secondly, a persona retrieval apparatus is provided, comprising: a request receiving module, used to receive a search request for a target persona, wherein the search request carries text description information of the target persona; a first acquisition module, used to perform semantic understanding on the text description information of the target persona using the first large language model, and obtain the key tags and persona features of the target persona output by the first large language model; a persona filtering module, used to select candidate personas from the known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database; a persona determination module, used to determine a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database; and a second acquisition module, used to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona using the second large language model, and obtain the retrieval results output by the second large language model. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
[0014] Thirdly, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the steps of the persona retrieval method described in the first aspect.
[0015] Fourthly, a computer-readable storage medium is provided, on which a computer program is stored, characterized in that, when the program is executed by a processor, it implements the steps of the persona retrieval method described in the first aspect.
[0016] Fifthly, a computer program product is provided, the computer program product including computer instructions stored in a computer-readable storage medium and adapted to be read and executed by a processor to cause an electronic device having the processor to perform the steps of the persona retrieval method described in the first aspect.
[0017] By applying the technical solution provided in the embodiments of this application, after receiving a retrieval request for a target persona, the first large language model is used to perform semantic understanding on the textual description information of the target persona carried in the retrieval request, obtaining the key tags and persona features of the target persona output by the first large language model. Based on the key tags of the target persona, known personas in the retrieval database are searched to filter out candidate personas. Then, based on the persona features of the target persona, reference personas are determined from the candidate personas. Finally, the second large language model is used to compare and analyze the textual description information corresponding to each reference persona with the textual description information of the target persona, obtaining the retrieval results output by the second large language model. By using large language models at both ends of the retrieval process, deep semantic understanding and credible interpretation of personas are achieved, realizing end-to-end intelligent persona retrieval, reducing manual intervention, lowering operating costs, and improving the efficiency and accuracy of persona retrieval.
Attached Figure Description
[0018] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.
[0019] Figure 1 is a flowchart illustrating one implementation of the persona retrieval method in this application;
Figure 2 is a flowchart illustrating another implementation of the persona retrieval method in this application;
Figure 3 is a schematic diagram of the system architecture applicable to the persona retrieval method in the embodiments of this application;
Figure 4 is a schematic diagram of the structure of a persona retrieval apparatus in an embodiment of this application;
Figure 5 is a schematic diagram of the structure of an electronic device according to an embodiment of this application.
Detailed Implementation
[0020] The technical solutions in the embodiments of this application will now be described with reference to the accompanying drawings.
[0021] Figure 1 shows an implementation flowchart of a persona retrieval method provided in an embodiment of this application. The method may include the following steps:
S110: Receive a search request for a target persona.
[0022] The search request carries a textual description of the target persona.
[0023] In this application embodiment, the target persona can be understood as the persona that the user wants to conceive. Persona is short for character design, a complex, abstract, and multi-dimensional concept encompassing rich connotations such as personality, motivation, values, behavioral patterns, and interpersonal relationships. The user can be understood as members of the creative team, such as screenwriters, directors, and producers, or other individuals with a persona retrieval need.
[0024] Users can submit a search request for a target persona whenever they have a persona retrieval need. For example, screenwriters may search for a target persona during film and television scriptwriting and planning; casting directors may do so during casting and role matching; game designers during game character design and plot generation; and novelists during literary creation.
[0025] Search requests targeting a specific persona can include a textual description of that persona. This textual description can be understood as information describing the persona using natural language, such as, "A misanthropic but brilliant doctor who uses acerbic sarcasm to mask his kindness."
[0026] Optionally, users can also upload multimodal information such as files, images, audio or video clips, which can be converted into text description information after being received.
[0027] Once a search request targeting a specific persona is received, subsequent steps can be performed.
[0028] S120: Use the first large language model to perform semantic understanding on the textual description information of the target persona, and obtain the key tags and persona features of the target persona output by the first large language model.
[0029] In this embodiment, a first large language model (LLM) can be pre-trained. This LLM is trained or guided to act as a senior script analyst, whose task is to extract two types of information from the input text:
Key tags: words that best summarize the core characteristics of a persona, such as "doctor," "world-weary," "exceptionally talented," "cold on the outside but warm on the inside," and "kind." These key tags serve as important indexes for subsequent retrieval.
Persona features: a more complete and detailed structured description of the persona, such as its core conflict, behavior patterns, and underlying motivations.
[0030] Upon receiving a search request for a target persona, the system can obtain the textual description of the target persona carried in the search request. The textual description of the target persona and the first prompt text can be input into the first language model. The first prompt text guides the first language model to perform semantic understanding of the textual description of the target persona, and obtain the key tags and persona features of the target persona output by the first language model.
[0031] The purpose of the first prompt text is to guide the first language model to semantically understand the textual description information of the target persona, and to standardize the output of the key tags and persona features of the target persona according to the preset structured format.
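As an illustration of how a first prompt text and a preset structured format might fit together, the following is a minimal sketch, not the applicant's implementation. The prompt wording, the JSON keys `key_tags` and `persona_features`, and the function names are all assumptions introduced here; the call to the language model itself is left out.

```python
import json

# Hypothetical prompt wording. The source only states that the prompt guides
# the model to act as a senior script analyst and to answer in a preset
# structured format; the JSON schema below is an assumption.
FIRST_PROMPT = (
    "You are a senior script analyst. Read the persona description and reply "
    "with JSON only, using the keys 'key_tags' (list of short strings) and "
    "'persona_features' (object with 'core_conflict', 'behavior_patterns', "
    "'underlying_motivations')."
)


def build_first_input(description: str) -> str:
    """Combine the first prompt text with the target persona description."""
    return f"{FIRST_PROMPT}\n\nPersona description:\n{description}"


def parse_first_output(raw: str) -> tuple[list[str], dict]:
    """Validate the model reply and return (key_tags, persona_features)."""
    data = json.loads(raw)
    tags = data["key_tags"]
    features = data["persona_features"]
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("key_tags must be a list of strings")
    if not isinstance(features, dict):
        raise ValueError("persona_features must be an object")
    # Normalize tags so that later tag matching is case-insensitive.
    return [t.strip().lower() for t in tags], features
```

Validating the reply before use is one way to enforce the "standardized" output the paragraph describes; a reply that does not parse could be retried or rejected.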
[0032] S130: Based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database, select candidate personas from the known personas.
[0033] In this embodiment of the application, a retrieval database can be pre-built, which stores relevant information corresponding to known personas, such as tag information, feature information, and text description information.
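One possible shape for a record in such a retrieval database is sketched below. The class names, field names, and the in-memory dictionary are assumptions made for illustration; the source only specifies that tag information, feature information, and text description information are stored for each known persona.

```python
from dataclasses import dataclass, field


@dataclass
class KnownPersona:
    """One known persona; all field names are hypothetical."""
    name: str
    description: str                      # text description information
    tags: set[str]                        # tag information
    features: str                         # feature information as text
    # Optional precomputed embedding of `features` (empty if not computed).
    feature_vector: list[float] = field(default_factory=list)


class RetrievalDatabase:
    """In-memory stand-in for the pre-built retrieval database."""

    def __init__(self) -> None:
        self._records: dict[str, KnownPersona] = {}

    def add(self, persona: KnownPersona) -> None:
        self._records[persona.name] = persona

    def get(self, name: str) -> KnownPersona:
        return self._records[name]

    def all(self) -> list[KnownPersona]:
        return list(self._records.values())
```

A production system would more likely use a persistent store with a tag index and a vector index, but the three kinds of stored information stay the same.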
[0034] After the first large language model has performed semantic understanding on the textual description information of the target persona and output the key tags and persona features of the target persona, the key tags of the target persona can be compared with the tag information corresponding to each known persona in the retrieval database. Based on the comparison results, candidate personas can be selected from the known personas.
[0035] S140: Based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, determine the reference persona among the candidate personas.
[0036] After the candidate personas have been selected from the known personas, the persona features of the target persona can be compared with the feature information corresponding to each candidate persona in the retrieval database. Based on the comparison results, the reference persona is determined from the candidate personas.
[0037] Understandably, tag comparison involves less computation, while feature comparison involves more. First, the key tags of the target persona are compared with the tag information corresponding to each known persona in the retrieval database to filter out candidate personas. The number of candidate personas filtered out is less than the number of known personas. Then, the persona features of the target persona are compared with the feature information corresponding to each candidate persona to determine the reference persona. This can reduce the computational workload of feature comparison and improve processing efficiency.
[0038] S150: Using the second large language model, compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and obtain the retrieval results output by the second large language model. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
[0039] In this embodiment of the application, a second language model can be pre-trained. The task of the second language model is to perform comparative analysis on the input text and generate a detailed search result in natural language.
[0040] Once the reference persona is determined, the text description information corresponding to each reference persona, the text description information of the target persona, and the second prompt text can be input into the second language model. The second prompt text guides the second language model to compare and analyze the text description information corresponding to the reference persona and the text description information of the target persona, and obtain the search results output by the second language model. The search results can include the reasons for the similarity between the reference persona and the target persona.
[0041] The purpose of the second prompt text is to guide the second language model to compare and analyze the textual description information corresponding to the reference persona and the textual description information of the target persona, and to standardize the second language model to output the search results in a preset structured format.
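The second prompt text and its structured output could be handled along the following lines. This is a sketch under stated assumptions: the prompt wording, the `results`, `name`, and `reasons` keys, and both function names are hypothetical, and the model call is omitted.

```python
import json

# Hypothetical prompt wording and output schema; the source only requires
# that the second model compares the descriptions and answers in a preset
# structured format that includes the reasons for similarity.
SECOND_PROMPT = (
    "Compare each reference persona with the target persona. Reply with JSON "
    'only: {"results": [{"name": "...", "reasons": ["...", "..."]}]}.'
)


def build_second_input(target_description: str, references: dict[str, str]) -> str:
    """Assemble the prompt, the target description, and each reference description."""
    lines = [SECOND_PROMPT, "", "Target persona:", target_description,
             "", "Reference personas:"]
    for name, text in references.items():
        lines.append(f"- {name}: {text}")
    return "\n".join(lines)


def parse_second_output(raw: str) -> list[dict]:
    """Return the result list, rejecting entries without a name or reasons."""
    results = json.loads(raw)["results"]
    for item in results:
        if not item.get("name") or not item.get("reasons"):
            raise ValueError("each result needs a name and at least one reason")
    return results
```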
[0042] Furthermore, the search results can be output so that the party initiating the search request can view the results.
[0043] Search results can be output through search reports, which means responding to search requests by providing search reports, thereby improving the credibility and usability of search results.
[0044] For example, one possible search result is as follows: Reference persona B is highly similar to the target persona you described. The reason for this similarity is: 1. Same professional background (both are doctors); 2. The core personality traits match. For example, character B shows a high degree of responsibility towards patients in many plot points (corresponding to "kindness"), but his dialogue style is sarcastic and mean (corresponding to "misanthropy" and "sarcasm").
[0045] In this embodiment of the application, a complete closed loop of "understanding-quantification-interpretation" combines human intuition with machine computing power, thereby achieving efficient, in-depth, and interpretable persona retrieval.
[0046] By applying the method provided in the embodiments of this application, a large language model is used at both ends of the retrieval process to achieve deep semantic understanding and credible interpretation of the persona, realize end-to-end intelligent persona retrieval, reduce manual intervention, lower operating costs, and improve the efficiency and accuracy of persona retrieval.
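Steps S110 to S150 can be sketched end to end as follows. This is an illustrative outline, not the claimed implementation: the two language models and the feature comparison are passed in as plain callables (`analyze`, `explain`, `score`), and the record layout and the parameter names `min_tags` and `top_n` are assumptions.

```python
from typing import Callable

# Hypothetical record layout: {"name", "tags", "features", "description"}.
Record = dict


def retrieve(
    description: str,
    database: list[Record],
    analyze: Callable[[str], tuple[list[str], str]],
    score: Callable[[str, str], float],
    explain: Callable[[str, list[Record]], object],
    min_tags: int = 1,
    top_n: int = 5,
):
    """Run the retrieval flow for one request (S110 is the call itself)."""
    # S120: the first model yields key tags and persona features.
    key_tags, features = analyze(description)
    # S130: cheap tag comparison narrows the known personas to candidates.
    wanted = set(key_tags)
    candidates = [r for r in database if len(wanted & set(r["tags"])) >= min_tags]
    # S140: costlier feature comparison ranks only the candidates.
    ranked = sorted(candidates, key=lambda r: score(features, r["features"]),
                    reverse=True)
    references = ranked[:top_n]
    # S150: the second model compares descriptions and explains the similarity.
    return explain(description, references)
```

Injecting the models as callables keeps the control flow testable with stubs and makes the order of the cheap and expensive comparisons explicit.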
[0047] In some embodiments of this application, step S140, in which the reference persona is determined among the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, may include the following steps: determining the similarity between the target persona and each candidate persona based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database; and determining a reference persona from the candidate personas based on the similarity.
[0048] For ease of description, the above steps will be combined for explanation.
[0049] In this embodiment, once the key tags and persona features of the target persona have been obtained from the first language model and the candidate personas have been selected from the known personas by tag comparison, the persona features of the target persona can be compared with the feature information corresponding to each candidate persona in the retrieval database to determine the similarity between the target persona and each candidate persona. Then, based on the similarity, a reference persona is determined from the candidate personas. For any candidate persona, the higher its similarity to the target persona, the closer the features of the two personas are, and the more similar their intrinsic core traits, narrative functions, or artistic expressions are.
[0050] Determining reference personas among the candidate personas based on the similarity between the target persona and each candidate persona can improve the accuracy of persona retrieval.
[0051] In some embodiments of this application, before the similarity between the target persona and each candidate persona is determined based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, the method further includes: converting the persona features of the target persona into a query vector; and converting the feature information corresponding to each candidate persona into a feature vector. In this case, determining the similarity between the target persona and each candidate persona includes: determining the similarity based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0052] In this embodiment of the application, after semantic understanding of the textual description information of the target persona using the first language model and obtaining the key tags and persona features of the target persona output by the first language model, the persona features of the target persona can be converted into a query vector.
[0053] Optionally, the persona features of the target persona can be converted into a query vector using a pre-trained text embedding model, such as Sentence-BERT (S-BERT), a sentence-embedding model built on Bidirectional Encoder Representations from Transformers (BERT).
[0054] After candidate personas have been selected from the known personas based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database, the feature information corresponding to each candidate persona can be converted into a feature vector.
[0055] Optionally, a text embedding model can be used to convert the feature information corresponding to each candidate persona into a feature vector.
[0056] Based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona, the similarity between the target persona and each candidate persona can be determined.
[0057] Optionally, for each candidate persona, the cosine similarity between the query vector corresponding to the target persona and the feature vector corresponding to the candidate persona can be calculated. That is, the cosine similarity algorithm is used to measure the similarity between the feature vectors of the two personas, and the cosine similarity is determined as the similarity between the target persona and the candidate persona.
[0058] Cosine similarity measures the directional similarity between two vectors by calculating the cosine of the angle between them in a multidimensional space. Its range is [-1, 1], and the closer the value is to 1, the more similar the two vectors are.
[0059] The cosine similarity between two vectors can be calculated using the following formula:
$$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
[0060] where A represents the query vector corresponding to the target persona, B represents the feature vector corresponding to a candidate persona in the retrieval database, and n is the dimension of the vectors.
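The cosine similarity between a query vector and a feature vector can be computed directly from its definition, as in the short sketch below. The function and parameter names are chosen here for illustration, and raising an error for a zero-length vector is a design choice of this sketch, since the cosine is undefined in that case.

```python
import math


def cosine_similarity(query: list[float], feature: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(query) != len(feature):
        raise ValueError("vectors must have the same dimension")
    dot = sum(q * f for q, f in zip(query, feature))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_f = math.sqrt(sum(f * f for f in feature))
    if norm_q == 0.0 or norm_f == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_q * norm_f)
```

Because only the direction of the vectors matters, scaling either vector by a positive constant leaves the result unchanged.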
[0061] Text embedding technology maps text information to a high-dimensional mathematical vector. In this vector space, texts with similar meanings have vectors that are spatially closer. By comparing the query vector corresponding to the target persona with the feature vectors corresponding to each candidate persona, the similarity between the target persona and each candidate persona can be accurately quantified.
[0062] In some embodiments of this application, before the similarity between the target persona and each candidate persona is determined based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, the method may further include: converting the persona features of the target persona into a query vector, where the feature information corresponding to each candidate persona already includes a feature vector. In this case, determining the similarity between the target persona and each candidate persona includes: determining the similarity based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0063] In this embodiment of the application, the feature information corresponding to each known persona in the retrieval database includes a feature vector, that is, the feature information corresponding to each known persona has been converted into a feature vector when the retrieval database is constructed.
[0064] After the first large language model has performed semantic understanding on the textual description information of the target persona and output the key tags and persona features of the target persona, the persona features of the target persona can be converted into a query vector.
[0065] Optionally, a pre-trained text embedding model can be used to convert the persona features of the target persona into a query vector.
[0066] After filtering out candidate personas from the known personas based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database, the feature vector corresponding to each candidate persona can be obtained from the retrieval database, which helps to save processing time.
[0067] Based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona, the similarity between the target persona and each candidate persona can be determined.
[0068] Optionally, for each candidate persona, the cosine similarity between the query vector corresponding to the target persona and the feature vector corresponding to the candidate persona can be calculated. That is, the cosine similarity algorithm is used to measure the similarity between the feature vectors of the two personas, and the cosine similarity is determined as the similarity between the target persona and the candidate persona.
[0069] By using the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona, the similarity between the target persona and each candidate persona can be accurately quantified.
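The benefit of storing feature vectors when the database is built can be seen in the sketch below, where only the query is embedded at search time. The `toy_embed` function is a deliberately simple stand-in (word counts over a tiny fixed vocabulary) for a real text embedding model such as S-BERT; it and the other names are assumptions for illustration only.

```python
import math
from typing import Callable

Vector = list[float]


def toy_embed(text: str, vocab=("doctor", "sarcastic", "kind", "brave")) -> Vector:
    """Stand-in embedding: word counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Sketch-level choice: treat an all-zero vector as having no similarity.
    return dot / norm if norm else 0.0


def build_index(features_by_name: dict[str, str],
                embed: Callable[[str], Vector]) -> dict[str, Vector]:
    """Done once at database construction: embed every known persona."""
    return {name: embed(text) for name, text in features_by_name.items()}


def score_candidates(query_features: str, candidate_names: list[str],
                     index: dict[str, Vector],
                     embed: Callable[[str], Vector]) -> dict[str, float]:
    """Done per request: embed only the query, reuse stored feature vectors."""
    query_vector = embed(query_features)
    return {name: cosine(query_vector, index[name]) for name in candidate_names}
```

With this split, each request costs one embedding call plus one vector comparison per candidate, rather than one embedding call per candidate.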
[0070] In some embodiments of this application, determining a reference persona from the candidate personas based on similarity may include the following steps: the top N candidate personas with the highest similarity to the target persona are identified as reference personas, where N is an integer greater than or equal to 1. Alternatively, candidate personas whose similarity to the target persona is greater than or equal to the similarity threshold can be identified as reference personas.
[0071] In this embodiment, after determining the similarity between the target persona and each candidate persona based on the persona characteristics of the target persona and the corresponding characteristic information of each candidate persona in the retrieval database, the top N candidate personas with the highest similarity to the target persona can be identified. These N candidate personas, which have a high similarity to the target persona, can be determined as reference personas. N is an integer greater than or equal to 1, and can be set and adjusted according to the actual situation, such as being set to 5.
[0072] Alternatively, based on the target persona's characteristics and the corresponding characteristic information of each candidate persona in the search database, after determining the similarity between the target persona and each candidate persona, candidate personas with a similarity greater than or equal to a similarity threshold can be identified. These candidate personas, with high similarity to the target persona, can be designated as reference personas. The similarity threshold can be set and adjusted according to the actual situation, such as setting it to 80%.
[0073] Based on the similarity between the target persona and the candidate personas, the top N candidate personas with the highest similarity or those with a similarity greater than or equal to the similarity threshold are identified as reference personas. This can ensure that the reference personas have a high similarity to the target persona, which helps improve the accuracy of persona retrieval.
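The two selection strategies (top N or similarity threshold) can be sketched as follows. The function name `select_reference_personas` and the dictionary of precomputed similarities are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Optional


def select_reference_personas(
    similarities: Dict[str, float],
    top_n: Optional[int] = None,
    threshold: Optional[float] = None,
) -> List[str]:
    """Pick reference personas by top-N rank or by a similarity threshold."""
    # Assumption: exactly one of the two strategies is chosen per call.
    if (top_n is None) == (threshold is None):
        raise ValueError("provide exactly one of top_n or threshold")
    # Sort candidate names from highest to lowest similarity.
    ranked = sorted(similarities, key=lambda name: similarities[name], reverse=True)
    if top_n is not None:
        if top_n < 1:
            raise ValueError("N must be an integer greater than or equal to 1")
        return ranked[:top_n]
    return [name for name in ranked if similarities[name] >= threshold]
```

With the threshold strategy the result may be empty when no candidate is similar enough, whereas top N always returns up to N candidates.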
[0074] In some embodiments of this application, step S130 filters candidate personas from known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database, including: For each known persona, if the tag information corresponding to the known persona in the pre-built retrieval database includes at least M key tags of the target persona, then the known persona is determined as a candidate persona, where M is an integer greater than or equal to 1.
[0075] In this embodiment, the retrieval database stores tag information corresponding to multiple known personas. After semantically understanding the textual description information of the target persona using the first language model to obtain the key tags and persona features of the target persona output by the first language model, for each known persona in the retrieval database, the tag information corresponding to the known persona can be compared with the key tags of the target persona. If the tag information corresponding to the known persona includes at least M key tags of the target persona, then the tag information corresponding to the known persona is considered to be similar to the key tags of the target persona, and the known persona can be identified as a candidate persona. M is an integer greater than or equal to 1, such as 2.
[0076] For any known persona, if the tag information corresponding to the known persona includes at least M key tags of the target persona, the known persona is identified as a candidate persona, and then a reference persona is identified from the candidate personas, which can improve the accuracy of persona retrieval.
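As an illustration of the "at least M key tags" rule, the sketch below filters known personas by tag overlap. The function name `filter_candidates` and the representation of tag information as Python sets are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set


def filter_candidates(
    key_tags: Iterable[str],
    known_tags: Dict[str, Set[str]],
    m: int = 1,
) -> List[str]:
    """Return known personas whose tags include at least m of the key tags."""
    if m < 1:
        raise ValueError("M must be an integer greater than or equal to 1")
    wanted = set(key_tags)
    return [
        name
        for name, tags in known_tags.items()
        if len(wanted & set(tags)) >= m
    ]
```

A larger M narrows the candidate set, and a smaller M keeps more known personas for the later similarity step.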
[0077] In some embodiments of this application, the retrieval database can be constructed through the following steps: Determine the important personas in the script library; Extract the text description information corresponding to each important persona; Based on the text description information corresponding to each important persona, determine the tag information and feature information corresponding to each important persona; Treat each important persona as a known persona, and store the corresponding text description information, tag information, and feature information in the retrieval database.
[0078] For ease of description, the above steps will be combined for explanation.
[0079] In this embodiment, important characters in the script library can be identified first. Optionally, the main characters of each script in the script library can be identified as important characters, or important characters in the script library can be identified through manual selection.
[0080] Text description information can then be extracted for each important character, such as the character biography, script summary, core plot points, key lines, character name, and source work.
[0081] Based on the text description information corresponding to each important character, the tag information and feature information corresponding to each important character can be determined. Optionally, for each important character in the script library, the text description information and the first prompt text corresponding to that important character can be input into the first language model. The first prompt text guides the first language model to perform semantic understanding of the text description information corresponding to that important character, and obtain the tag information and feature information corresponding to that important character output by the first language model.
[0082] After obtaining the tag information and feature information corresponding to each important persona, each important persona can be further treated as a known persona, and the text description information, tag information and feature information corresponding to each known persona can be stored in the retrieval database.
[0083] Optionally, the feature information corresponding to each known persona may include a feature vector.
[0084] Optionally, the retrieval database may include a text database and a vector database. The text database is used to store text description information, tag information, and feature information corresponding to each known persona, while the vector database is used to store feature vectors corresponding to each known persona.
[0085] Optionally, the tag information corresponding to each known persona in the database can be updated according to update instructions, such as adding, deleting, or modifying.
[0086] The above steps can be used to build a retrieval database in advance, providing known character information for subsequent character retrieval.
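The construction steps above can be sketched as follows. This is a minimal illustration, not the application's implementation: `PersonaRecord`, `RetrievalDatabase`, and `build_retrieval_database` are hypothetical names, in-memory dictionaries stand in for the text database and the vector database, and the `analyze` and `embed` callables stand in for the first language model and a text embedding model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class PersonaRecord:
    """Text-side record for one known persona (hypothetical layout)."""
    description: str
    tags: Set[str]
    features: str


@dataclass
class RetrievalDatabase:
    """In-memory stand-in for the text database plus the vector database."""
    text_db: Dict[str, PersonaRecord] = field(default_factory=dict)
    vector_db: Dict[str, List[float]] = field(default_factory=dict)


def build_retrieval_database(
    descriptions: Dict[str, str],
    analyze: Callable[[str], Tuple[Set[str], str]],
    embed: Callable[[str], List[float]],
) -> RetrievalDatabase:
    """Store description, tags, features and a feature vector per important persona."""
    db = RetrievalDatabase()
    for name, description in descriptions.items():
        # Stand-in for the first language model: returns (tags, feature text).
        tags, features = analyze(description)
        db.text_db[name] = PersonaRecord(description, set(tags), features)
        # Stand-in for the text embedding model: feature text -> vector.
        db.vector_db[name] = list(embed(features))
    return db
```

Keeping the two stores keyed by the same persona name lets a tag-based filter on the text side hand its candidates directly to a vector lookup.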
[0087] In some embodiments of this application, the search results may also include at least one of the following: The ranking of the reference persona; The name of the reference persona; The similarity between the reference persona and the target persona.
[0088] In this embodiment of the application, the second language model is used to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and obtain the retrieval results output by the second language model. The retrieval results may include the reasons for the similarity between the reference persona and the target persona, and may also include at least one of the following: the ranking of the reference persona, the persona name of the reference persona, and the similarity between the reference persona and the target persona.
[0089] The ranking of reference characters can be obtained by sorting them according to the similarity between the reference character and the target character.
[0090] Optionally, the search results can be in list form, with each part of the list corresponding to the analysis results of a reference persona.
[0091] For example: Ranking of reference persona A: 1; Name of reference persona A: 《XXX》A; Similarity between reference persona A and the target persona: 93%; Reasons for the similarity between reference persona A and the target persona: A and the target persona are highly similar. Both are geniuses with extraordinary observational skills, which they use to solve cold cases. A also exhibits obvious social withdrawal and post-traumatic stress disorder characteristics, consistent with the description of "questionable emotional intelligence." Their motivation for solving cases stems more from personal psychological drive than from a sense of professional responsibility, which is also highly consistent.
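One possible in-memory shape for such a result list is sketched below. The `ResultEntry` fields mirror the four items in the example (ranking, name, similarity, reasons); the class name, the function names, and the one-line text layout are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ResultEntry:
    """One part of the result list, covering a single reference persona."""
    rank: int
    name: str
    similarity: float
    reason: str


def build_result_list(items: List[Tuple[str, float, str]]) -> List[ResultEntry]:
    """Rank (name, similarity, reason) triples from highest to lowest similarity."""
    ordered = sorted(items, key=lambda item: item[1], reverse=True)
    return [
        ResultEntry(rank, name, similarity, reason)
        for rank, (name, similarity, reason) in enumerate(ordered, start=1)
    ]


def format_entry(entry: ResultEntry) -> str:
    """Render one entry as a single line of text (hypothetical layout)."""
    return f"{entry.rank}. {entry.name} (similarity {entry.similarity:.0%}): {entry.reason}"
```

Here the ranking is derived purely from the similarity score, matching the statement that the ranking can be obtained by sorting on similarity.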
[0092] For ease of understanding, the technical solutions provided in the embodiments of this application will be described below through specific examples.
[0093] First, prepare the offline data.
[0094] Before services are provided, the script library of the content platform needs to be preprocessed. For each important character in the script library, text description information such as the character biography, core plot, key lines, character name, and source work is extracted to form a detailed "character profile." Based on the text description information of each important character, the corresponding tag information and feature information are determined. Optionally, a text embedding model (such as BERT or a robustly optimized BERT approach (RoBERTa)) can be used to convert the feature information of each important character into a feature vector, i.e., a high-dimensional vector (e.g., 768-dimensional), which is stored in a vector database (such as Milvus (an open-source vector database), Facebook AI Similarity Search (FAISS), or Pinecone (a hosted vector database)). The tag information and text description information of the important characters are stored in a text database. The vector database and the text database together constitute the retrieval database. The important characters serve as the known personas.
[0095] As shown in Figure 2 and Figure 3, the process corresponding to this specific embodiment is as follows: After the system starts, it waits for the user to enter text description information of the target persona via a text box, for example, "Looking for a genius detective with exceptionally high IQ but questionable EQ." The system inputs the user-provided text description information of the target persona and the first prompt text into the first language model (e.g., a fine-tuned GPT-4 (a large-scale multimodal model) or an equivalent model), which performs a pre-analysis. Based on the first prompt text, the first language model outputs two parts: Key tags: ["genius", "detective", "high IQ", "low EQ"]; Persona features (structured persona features): "This character is a detective focused on solving complex cases through logic and observation. His core driving force is intellectual challenge rather than a sense of justice. He exhibits significant difficulties in social interactions, struggling to understand or care about the emotions of others." The system performs initial screening in the retrieval database (text database) based on the key tags. For example, it retrieves all known personas whose tag information contains both the "detective" and "genius" tags, resulting in a small candidate set containing multiple candidate personas. It then retrieves, from the retrieval database (vector database), the pre-stored feature vector corresponding to each candidate persona in the candidate set.
[0096] The system inputs the persona features into a text embedding model to generate a query vector.
[0097] The system traverses the candidate set and uses the cosine similarity algorithm to calculate the similarity between the query vector and the feature vector corresponding to each candidate persona.
[0098] The candidate personas in the candidate set are sorted from high to low by similarity.
[0099] The system selects the top few (e.g., top 5) candidate personas after ranking as reference personas. The system takes the text description information, similarity score, and ranking corresponding to each reference persona, the user's initial text description information for the target persona, and the second prompt text as input, and submits them to the second language model. The second language model then generates a detailed comparative analysis for each reference persona, forming the search results.
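How the inputs to the second language model might be assembled is sketched below. The application does not specify the layout of this input, so the function name `build_comparison_input`, the dictionary keys, and the text format are all assumptions; the call to the second language model itself is deliberately left out.

```python
from typing import Dict, List, Union

Reference = Dict[str, Union[int, float, str]]


def build_comparison_input(
    target_description: str,
    references: List[Reference],
    second_prompt: str,
) -> str:
    """Combine the second prompt text, the target description and the references."""
    lines = [
        second_prompt,
        "",
        f"Target persona: {target_description}",
        "",
        "Reference personas:",
    ]
    for ref in references:
        # Hypothetical keys: rank, name, similarity, description.
        lines.append(f"{ref['rank']}. {ref['name']} (similarity {ref['similarity']:.2f})")
        lines.append(f"   Description: {ref['description']}")
    return "\n".join(lines)
```

Passing the similarity score and ranking alongside each description gives the second language model the same ordering the vector step produced, so its comparative analysis can be reported per reference persona.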
[0100] The system will output the search results to the user.
[0101] Users can select one or more reference characters based on the search results for further action.
[0102] In this embodiment, a closed-loop learning mechanism based on user feedback can be introduced to dynamically fine-tune each model according to the user's final choice, making it increasingly closer to business needs.
[0103] This application employs a large language model at both the input and output ends of the retrieval process, achieving deep semantic understanding and credible interpretation of personas, and deeply coupling this capability with vector retrieval technology. While ensuring retrieval efficiency, it enhances the accuracy, depth, and interpretability of the results.
[0104] The technical solutions provided in this application have certain strategic and commercial value for film and television content companies, game development companies, and others. On the content creation side, they can accelerate script development and character design processes, provide inspiration for screenwriters, and assist casting directors in finding actors who match the character's temperament more scientifically and efficiently, thereby reducing production costs and improving content quality. On the platform operation side, this technical solution can be used to build more refined user profiles and content tagging systems, enabling personalized recommendations for viewers based on "character preferences," thereby enhancing user stickiness and platform competitiveness.
[0105] Corresponding to the above method embodiments, this application also provides a persona retrieval device. The persona retrieval device described below can be referred to in correspondence with the persona retrieval method described above.
[0106] As shown in Figure 4, the persona retrieval device 400 may include the following modules: The request receiving module 410 is used to receive a search request for a target persona, the search request carrying text description information of the target persona. The first acquisition module 420 is used to perform semantic understanding on the text description information of the target persona using the first large language model, and to obtain the key tags and persona features of the target persona output by the first large language model. The persona screening module 430 is used to screen out candidate personas from known personas based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database. The persona determination module 440 is used to determine a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database. The second acquisition module 450 is used to compare and analyze, using the second large language model, the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and to obtain the retrieval results output by the second large language model. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
[0107] The apparatus provided in this application uses a large language model at both ends of the retrieval process to achieve deep semantic understanding and credible interpretation of personas, realizing end-to-end intelligent persona retrieval, reducing manual intervention, lowering operating costs, and improving the efficiency and accuracy of persona retrieval.
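The module structure can be pictured as a single class whose method chains the functions of modules 420 to 450. The sketch below is only one possible arrangement: `PersonaRetrievalDevice`, its field names, and the use of plain callables as stand-ins for the two language models and the text embedding model are assumptions, not the apparatus defined in this application.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Set, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class PersonaRetrievalDevice:
    understand: Callable[[str], Tuple[Set[str], str]]  # stand-in for the first language model
    embed: Callable[[str], Sequence[float]]            # stand-in for a text embedding model
    explain: Callable[[str, str], str]                 # stand-in for the second language model
    tags: Dict[str, Set[str]]                          # tag information per known persona
    vectors: Dict[str, Sequence[float]]                # feature vector per known persona
    descriptions: Dict[str, str]                       # text description per known persona
    min_shared_tags: int = 1                           # M
    top_n: int = 5                                     # N

    def handle_request(self, target_description: str) -> List[Tuple[str, float, str]]:
        # Module 420: key tags and persona features of the target persona.
        key_tags, features = self.understand(target_description)
        # Module 430: keep known personas sharing at least M key tags.
        candidates = [
            name for name, persona_tags in self.tags.items()
            if len(set(key_tags) & persona_tags) >= self.min_shared_tags
        ]
        # Module 440: rank candidates by cosine similarity and keep the top N.
        query = self.embed(features)
        scored = sorted(
            ((name, _cosine(query, self.vectors[name])) for name in candidates),
            key=lambda pair: pair[1],
            reverse=True,
        )[: self.top_n]
        # Module 450: attach the reasons for similarity to each reference persona.
        return [
            (name, score, self.explain(self.descriptions[name], target_description))
            for name, score in scored
        ]
```

Injecting the models as callables keeps the sketch independent of any particular model provider and makes each module easy to replace or test in isolation.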
[0108] In some embodiments of this application, the persona determination module 440 is specifically used for: determining the similarity between the target persona and each candidate persona based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database; and determining a reference persona from the candidate personas based on the similarity.
[0109] In some embodiments of this application, a first conversion module is also included, for: before the similarity between the target persona and each candidate persona is determined based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, converting the persona features of the target persona into a query vector, and converting the feature information corresponding to each candidate persona into a feature vector. The persona determination module 440 is specifically used for: determining the similarity between the target persona and each candidate persona based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0110] In some embodiments of this application, a second conversion module is also included, for: before the similarity between the target persona and each candidate persona is determined based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, converting the persona features of the target persona into a query vector, where the feature information corresponding to each candidate persona includes a feature vector. The persona determination module 440 is specifically used for: determining the similarity between the target persona and each candidate persona based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona.
[0111] In some embodiments of this application, the persona determination module 440 is specifically used for: identifying the top N candidate personas with the highest similarity to the target persona as reference personas, where N is an integer greater than or equal to 1; or identifying candidate personas whose similarity to the target persona is greater than or equal to a similarity threshold as reference personas.
[0112] In some embodiments of this application, the persona screening module 430 is specifically used for: For each known persona, if the tag information corresponding to the known persona in the pre-built retrieval database includes at least M key tags of the target persona, then the known persona is determined as a candidate persona, where M is an integer greater than or equal to 1.
[0113] In some embodiments of this application, a construction module is also included for constructing the retrieval database through the following steps: Determine the important personas in the script library; Extract the text description information corresponding to each important persona; Based on the text description information corresponding to each important persona, determine the tag information and feature information corresponding to each important persona; Treat each important persona as a known persona, and store the corresponding text description information, tag information, and feature information in the retrieval database.
[0114] In some embodiments of this application, the search results also include at least one of the following: The ranking of the reference persona; The name of the reference persona; The similarity between the reference persona and the target persona.
[0115] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0116] This application also provides an electronic device. As shown in Figure 5, it includes a processor 501, a communication interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504. The memory 503 is used to store computer programs. When the processor 501 executes the program stored in the memory 503, it performs the following steps: Receive a search request for a target persona, the search request carrying text description information of the target persona; Perform semantic understanding on the text description information of the target persona using the first language model, and obtain the key tags and persona features of the target persona output by the first language model; Based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database, screen out candidate personas from the known personas; Based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, determine a reference persona from the candidate personas; Use the second language model to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and obtain the retrieval results output by the second language model. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
[0117] The aforementioned communication bus 504 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus 504 can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, it is represented by only one thick line in the figure, but this does not indicate that there is only one bus or one type of bus.
[0118] The communication interface 502 is used for communication between the aforementioned electronic device and other devices.
[0119] The memory 503 may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory 503 may also be at least one storage device located remotely from the aforementioned processor.
[0120] The processor 501 mentioned above can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0121] In another embodiment provided in this application, a computer-readable storage medium is also provided, which stores instructions that, when executed on a computer, cause the computer to perform the steps of any of the persona retrieval methods described in the above embodiments.
[0122] In another embodiment provided in this application, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform the steps of any of the persona retrieval methods in the above embodiments.
[0123] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).
[0124] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
[0125] The various embodiments in this specification are described in a related manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.
[0126] The above are merely preferred embodiments of this application and are not intended to limit the scope of protection of this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application are included within the scope of protection of this application.
Claims
1. A persona retrieval method, characterized in that it includes: Receive a search request for a target persona, the search request carrying text description information of the target persona; The first language model is used to perform semantic understanding on the text description information of the target persona, and the key tags and persona features of the target persona output by the first language model are obtained; Based on the key tags of the target persona and the tag information corresponding to each known persona in the pre-built retrieval database, candidate personas are selected from the known personas; Based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, a reference persona is determined from the candidate personas; The second language model is used to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona, and the retrieval results output by the second language model are obtained. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
2. The method according to claim 1, characterized in that, The process of determining a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database includes: Based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, the similarity between the target persona and each candidate persona is determined; Based on the similarity, a reference persona is determined from the candidate personas.
3. The method according to claim 2, characterized in that, Before determining the similarity between the target persona and each candidate persona based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, the method further includes: Convert the persona features of the target persona into a query vector; Convert the feature information corresponding to each candidate persona into a feature vector; The determination of the similarity between the target persona and each candidate persona, based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, includes: Based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona, the similarity between the target persona and each candidate persona is determined.
4. The method according to claim 2, characterized in that, Before determining the similarity between the target persona and each candidate persona based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, the method further includes: Convert the persona features of the target persona into a query vector; The feature information corresponding to each candidate persona includes a feature vector; The determination of the similarity between the target persona and each candidate persona, based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database, includes: Based on the query vector corresponding to the target persona and the feature vector corresponding to each candidate persona, the similarity between the target persona and each candidate persona is determined.
5. The method according to claim 2, characterized in that, Based on the aforementioned similarity, reference personas are determined from the candidate personas, including: The top N candidate personas with the highest similarity to the target persona are determined as reference personas, where N is an integer greater than or equal to 1; Alternatively, candidate personas with a similarity greater than or equal to the similarity threshold with the target persona can be identified as reference personas.
6. The method according to claim 1, characterized in that, The step of filtering candidate personas from known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database includes: For each known persona, if the tag information corresponding to the known persona in the pre-constructed retrieval database includes at least M key tags of the target persona, then the known persona is determined as a candidate persona, where M is an integer greater than or equal to 1.
7. The method according to any one of claims 1, characterized in that, The retrieval database is constructed using the following steps: Determine the important personas in the script library; Extract the text description information corresponding to each important persona; Based on the text description information corresponding to each important persona, determine the tag information and feature information corresponding to each important persona; Each important persona is treated as a known persona, and the corresponding text description information, tag information, and feature information are stored in the retrieval database.
8. The method according to any one of claims 1 to 7, characterized in that, The search results also include at least one of the following: The ranking of the reference characters; The name of the reference character; The similarity between the reference persona and the target persona.
9. A persona retrieval device, characterized in that it includes: The request receiving module is used to receive a search request for a target persona, wherein the search request carries text description information of the target persona; The first acquisition module is used to perform semantic understanding on the text description information of the target persona using the first large language model, and to obtain the key tags and persona features of the target persona output by the first large language model; The persona screening module is used to screen out candidate personas from known personas based on the key tags of the target persona and the tag information corresponding to each known persona in a pre-built retrieval database; The persona determination module is used to determine a reference persona from the candidate personas based on the persona features of the target persona and the feature information corresponding to each candidate persona in the retrieval database; The second acquisition module is used to compare and analyze the text description information corresponding to each reference persona in the retrieval database with the text description information of the target persona using the second large language model, and obtain the retrieval results output by the second large language model. The retrieval results include the reasons for the similarity between the reference persona and the target persona.
10. An electronic device, characterized in that, It includes a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; Memory, used to store computer programs; A processor, when executing a program stored in memory, implements the steps of the persona retrieval method as described in any one of claims 1 to 8.
11. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the program is executed by the processor, it implements the steps of the persona retrieval method as described in any one of claims 1 to 8.
12. A computer program product comprising computer instructions stored in a computer-readable storage medium and adapted to be read and executed by a processor to cause an electronic device having the processor to perform the steps of the persona retrieval method as described in any one of claims 1 to 9.