Intelligent interaction system, method, readable storage medium and electronic device

By collecting, cleaning, and timestamping dialogue records in an intelligent interaction system, and combining semantic feature matching and large language models to generate responses, the problem of the lack of memory function in existing systems is solved, and a personalized and intelligent interactive experience is achieved.

CN122240748A · Pending · Publication Date: 2026-06-19 · INTERLATH (SHENZHEN) TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: INTERLATH (SHENZHEN) TECHNOLOGY CO LTD
Filing Date: 2024-12-17
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing intelligent interaction systems lack the ability to remember users' interaction history and cannot provide personalized responses, resulting in an unintelligent and unfriendly interactive experience.

Method used

The system collects user-system dialogue records through the data collection module, uses the data cleaning module to timestamp the extracted behavioral experiences and form memory text, uses the semantic feature extraction and matching modules to find the memory text most similar to the user's current input, constructs prompt words from that memory text and the current input, and invokes a large language model to generate a personalized response.

Benefits of technology

It enables personalized responses based on users' historical behavior and interaction content, enhancing the intelligence and thoughtfulness of the interactive experience and improving user satisfaction.

Abstract

This invention provides an intelligent interactive system, method, readable storage medium, and electronic device with memory function. The system includes modules for data collection, cleaning, semantic feature extraction, matching, prompt word construction, and large language model invocation. It collects user dialogue records, cleans them to form memory text, and stores it in a database. By extracting semantic features, it matches the user's current input with the memory text, constructs prompt words to trigger recall, and invokes a large language model to generate personalized responses. This invention solves the problem of existing intelligent interactive systems lacking the ability to remember user interaction history. It can remember user's historical interactions, accurately locate relevant memory text, provide high-quality text or voice responses, meet different needs, and enhance the personalization and intelligence of interaction.

Description

Technical Field

[0001] This application relates to the field of intelligent interaction technology, specifically to an intelligent interaction system, method, readable storage medium, and electronic device with memory function.

Background Technology

[0002] With the continuous development of artificial intelligence technology, intelligent interactive systems are playing an increasingly important role in people's lives and work. These systems provide services such as information retrieval, task execution, and entertainment through interaction with users. Although the application of intelligent interactive systems is becoming increasingly widespread, existing systems often lack the ability to remember users' interaction history. This means that the system cannot remember users' past interactions and cannot use this information to provide more personalized services. Due to the lack of memory function, existing systems cannot generate personalized responses based on users' past interactions, which limits the system's intelligence and user satisfaction. The interactive experience of existing intelligent interactive systems is not intelligent or user-friendly enough; users may feel that the system's responses are mechanical and lack human touch, which affects user acceptance and frequency of use.

Summary of the Invention

[0003] The main objective of this application is to provide an intelligent interactive system, method, readable storage medium, and electronic device with memory function, which solves the problem that existing intelligent interactive systems lack the ability to remember the user's interaction history. It enables personalized responses based on the user's past behavior and dialogue content, thereby improving the intelligence and thoughtfulness of the interactive experience.

[0004] To achieve the above objectives, the present invention proposes an intelligent interactive system, comprising:

[0005] The data collection module is used to collect dialogue records between users and the intelligent interaction system;

[0006] The data cleaning module is used to extract the user's behavioral experience from the dialogue records, mark the identified user's behavioral experience with timestamps to form memory text, and save it to the memory text library; wherein, the behavioral experience includes behavioral events and time information;

[0007] A semantic feature extraction module is used to extract semantic features from the dialogue record;

[0008] The semantic feature matching module is used to perform semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text; wherein, the memory texts contain the user's historical behavior and interaction experience;

[0009] The prompt word construction module is used to combine the memory text with the highest similarity to the current user input text determined by the semantic feature matching module with the user's current input text to construct prompt words for triggering user recall; and

[0010] The large language model invocation module is used to invoke the large language model based on the prompt words constructed by the prompt word construction module to generate a response to the user's current input.

[0011] The prompt word construction module is used for:

[0012] Receive the memory text that has the highest similarity to the current user input text determined by the semantic feature matching module, and the user's current input;

[0013] Analyze the user's current input to understand the user's needs, intentions, and emotional state;

[0014] Extract key information from the most similar memory text; wherein, the key information is related to the user's historical interactions;

[0015] Design prompt words by combining the analysis results of the user's current input with key information from the memory text.

[0016] The data cleaning module is used for:

[0017] Extract the user's behavioral experience from the dialogue records;

[0018] Generate corresponding timestamps based on the time information in the user's behavioral experience, and construct a memory text by combining the user's behavioral experience with the timestamps; and

[0019] Store the memory text in the memory text library.

[0020] The timestamp records the exact time information of the behavioral event and / or a relative time description.

[0021] The semantic feature matching module is used for:

[0022] Receive the user's current input text and convert it into a semantic feature vector;

[0023] The same semantic feature extraction is performed on each segment of memory text in the memory text library to form a semantic feature vector;

[0024] The semantic feature vector of the current input text is compared with the memory text vector in the memory text library. The memory text with the highest similarity is selected as the reference for generating the response.

[0025] The semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

[0026] The semantic feature extraction module extracts semantic features from the dialogue records; wherein the dialogue records include, but are not limited to, text input by the user, reply text generated by the system based on the user input, historical records of previous interactions between the user and the system, text, images or videos in documents uploaded by the user, and posts on social media.

[0027] To achieve the above objectives, the present invention also proposes an intelligent interaction method, the method comprising:

[0028] Collect dialogue records between the user and the intelligent interaction system;

[0029] The user's behavioral experience is extracted from the dialogue records, and the identified user behavioral experience is timestamped to form a memory text, which is then saved to the memory text library; wherein, the behavioral experience includes behavioral events and time information;

[0030] Extract semantic features from the dialogue records;

[0031] The current user input text is matched with the memory texts stored in the memory text database using semantic feature similarity to determine the memory text with the highest similarity to the current user input text; wherein, the memory texts contain the user's historical behavior and interaction experience;

[0032] By combining the identified memory text with the highest similarity to the current user input text with the current user input text, a prompt word is constructed to trigger the user's recall; and

[0033] The constructed prompt words are used to invoke the large language model to generate a response to the user's current input.

[0034] Specifically, combining the determined memory text having the highest similarity to the current user input text with the current user input text to construct prompt words for triggering user recall includes:

[0035] Receive the memory text that is most similar to the current user input text and the user's current input;

[0036] Analyze the user's current input to understand the user's needs, intentions, and emotional state;

[0037] Extract key information from the most similar memory text; wherein, the key information is related to the user's historical interactions;

[0038] Design prompt words by combining the analysis results of the user's current input with key information from the memory text.

[0039] Specifically, the user's behavioral experiences are extracted from the dialogue records, and the identified user behavioral experiences are timestamped to form memory text, which is then saved to a memory text library. This process includes:

[0040] Extract the user's behavioral experience from the dialogue records;

[0041] Generate a corresponding timestamp based on the time information in the user's behavioral experience, and combine the user's behavioral experience with the timestamp to construct a memory text;

[0042] Store the memory text in the memory text library.

[0043] The timestamp records the exact time information of the behavioral event and / or a relative time description.

[0044] Specifically, the semantic feature similarity matching between the current user input text and the memory texts stored in the memory text database is performed to determine the memory text with the highest similarity to the current user input text. This includes:

[0045] Receive the user's current input text and convert it into a semantic feature vector;

[0046] The same semantic feature extraction is performed on each segment of memory text in the memory text library to form a semantic feature vector;

[0047] The semantic feature vector of the current input text is compared with the memory text vector in the memory text library. The memory text with the highest similarity is selected as the reference for generating the response.

[0048] The semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

[0049] Semantic features are extracted from the dialogue records, which include, but are not limited to, text input by the user, reply text generated by the system based on the user input, historical records of previous interactions between the user and the system, text, images or videos in documents uploaded by the user, and posts on social media.

[0050] To achieve the above objectives, the present invention also proposes a readable storage medium storing a computer program, the computer program including program instructions, which, when executed by a processor of an electronic device, cause the processor to perform the steps of the intelligent interaction method described above.

[0051] To achieve the above objectives, the present invention also proposes an electronic device, comprising: a processor and a memory, wherein the memory is used to store computer program code, the computer program code including computer instructions, and when the processor executes the computer instructions, the electronic device performs the steps of the intelligent interaction method described above.

[0052] The beneficial effects of this application are as follows: Unlike existing technologies, this application provides an intelligent interactive system, method, readable storage medium, and electronic device with memory function. This system can remember the user's interaction history and behavioral experiences, providing more personalized and intelligent interactive services. Through semantic feature extraction and matching, it can accurately find the memory text related to the user's current input, providing more information and background knowledge for generating responses. The prompt word construction module can explicitly specify the reference memory text, allowing users to experience the memory function of the interactive system from the responses, thus improving the personalization and intelligence of the interaction. Utilizing a large language model to generate responses can provide high-quality text and voice responses, meeting the diverse needs of users.

Attached Figure Description

[0053] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the structures shown in these drawings without creative effort.

[0054] Figure 1 is a schematic diagram of one embodiment of an intelligent interactive system with memory function provided by the present invention;

[0055] Figure 2 is a flowchart illustrating one embodiment of an intelligent interaction method with memory function provided by the present invention;

[0056] Figure 3 is a flowchart illustrating one embodiment of step S21 in Figure 2;

[0057] Figure 4 is a flowchart illustrating one embodiment of step S23 in Figure 2;

[0058] Figure 5 is a flowchart illustrating one embodiment of step S24 in Figure 2;

[0059] Figure 6 is a schematic diagram of the hardware structure of an electronic device provided by the present invention.

[0060] The reference numerals used in the above figures are explained as follows:

[0061] 10. Intelligent Interaction System; 11. Data Collection Module; 12. Data Cleaning Module; 13. Semantic Feature Extraction Module; 14. Semantic Feature Matching Module; 15. Prompt Word Construction Module; 16. Large Language Model Invocation Module;

[0062] 3. Electronic device; 31. Processor; 32. Memory; 33. Input device; 34. Output device.

Detailed Implementation

[0063] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be noted that the following embodiments are for illustrative purposes only and do not limit the scope of the application. Similarly, the following embodiments are only some, not all, embodiments of the present application, and all other embodiments obtained by those skilled in the art without inventive effort are within the scope of protection of this application.

[0064] The terms "first," "second," and "third" in this application are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Therefore, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of that feature. In the description of this application, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified. All directional indications (such as up, down, left, right, front, back, etc.) in the embodiments of this application are only used to explain the relative positional relationships and movements between components in a specific orientation (as shown in the figures). If the specific orientation changes, the directional indications also change accordingly. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. A process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or devices.

[0065] In this document, the term "implementation" means that a specific feature, structure, or characteristic described in connection with an implementation may be included in at least one implementation of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same implementation, nor is it a separate or alternative implementation mutually exclusive with other implementations. It will be explicitly and implicitly understood by those skilled in the art that the implementations described herein can be combined with other implementations.

[0066] Please refer to Figure 1, which is a schematic diagram of the structure of an intelligent interactive system with memory function provided by the present invention. Specifically, the intelligent interactive system 10 includes: a data collection module 11, a data cleaning module 12, a semantic feature extraction module 13, a semantic feature matching module 14, a prompt word construction module 15, and a large language model invocation module 16. These modules work together to remember and utilize the user's interaction history in order to provide more personalized and intelligent interactive services.

[0067] The data collection module 11 is used to collect dialogue records between the user and the intelligent interaction system 10. These dialogue records may include various forms of data such as text and voice to meet the needs of different user interactions. The collected dialogue records form the basis for subsequent processing and analysis.

[0068] Specifically, the data collection module 11 collects dialogue records in real time by listening to the interaction interface between the user and the intelligent interaction system 10; and stores the collected dialogue records in a local database or cloud server. This storage method ensures the persistence and accessibility of the data so that subsequent modules can access and process it.

[0069] The data collection module 11 not only collects text data but also various other forms of data, such as voice. This multimodal data collection capability allows the system to adapt to different users' interaction habits and provide a more comprehensive interactive experience. Furthermore, the data collection module 11 can collect dialogue records in real time, meaning the system can respond instantly to user input, whether text or voice. This real-time capability is key to improving user experience and system responsiveness. The dialogue records collected by the data collection module 11 include user behavior history, encompassing not only direct input but also potential behavioral patterns and interaction history. This comprehensiveness helps the system understand users more deeply and provide more personalized services.

[0070] The data cleaning module 12 is used for:

[0071] The user's behavioral experience is extracted from the dialogue records; wherein, the behavioral experience includes behavioral events and time information;

[0072] The identified behavioral events and time information are timestamped to form memory text; and

[0073] The memory text is stored in the memory text library.

[0074] The memory text library can be stored in the form of a database, file system, or other similar formats to facilitate quick access and querying by subsequent modules.
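As an illustration only, the memory text library could be held in a small relational table. The sketch below uses Python's built-in sqlite3 module; the table name memory_text, its columns, and the helper functions are hypothetical and are not specified by the application.

```python
import sqlite3


def open_memory_library(path=":memory:"):
    """Open (or create) a hypothetical memory text library backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memory_text ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " user_id TEXT NOT NULL,"
        " event_time TEXT NOT NULL,"  # ISO 8601 timestamp of the behavioral event
        " text TEXT NOT NULL)"
    )
    return conn


def save_memory(conn, user_id, event_time, text):
    """Store one memory text together with its timestamp."""
    conn.execute(
        "INSERT INTO memory_text (user_id, event_time, text) VALUES (?, ?, ?)",
        (user_id, event_time, text),
    )
    conn.commit()


def load_memories(conn, user_id):
    """Return (event_time, text) rows for one user, oldest first."""
    cursor = conn.execute(
        "SELECT event_time, text FROM memory_text"
        " WHERE user_id = ? ORDER BY event_time",
        (user_id,),
    )
    return cursor.fetchall()
```

Storing the timestamp as an ISO 8601 string keeps chronological ordering equal to plain string ordering, which is one simple way to support the quick access and querying mentioned above.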

[0075] Specifically, the data cleaning module 12 sorts out the user's behavioral experience from the dialogue records, including identifying the activities (behavioral events) mentioned by the user in the dialogue and the time when these activities occurred (time information).

[0076] The data cleaning module 12 uses Natural Language Processing (NLP) technology to analyze and extract information from the dialogue records to identify user behavioral events, time information, etc. Specifically, the data cleaning module 12 uses NLP toolkits (such as NLTK, spaCy, etc.) to clean and analyze the dialogue records, identifying keywords and phrases. These toolkits provide functions such as noise removal, word segmentation, part-of-speech tagging, and named entity recognition. Behavioral events refer to the user's activities, such as where they went, where they ate, what they did, and what they ate. Time information is derived from both the time of the dialogue and its content, and indicates when the user's activity took place. For example, if the dialogue occurred at 6 PM on October 23, 2024, and the content includes "This afternoon I…", then the activity occurred on "October 23, 2024, afternoon". In this way, the data cleaning module 12 determines the user's behavioral events (such as "where they went" and "where they ate") and time information (such as "this afternoon" and "October 23, 2024, afternoon").

[0077] The data cleaning module 12 also generates a corresponding timestamp based on the time of the dialogue and the time information mentioned in the dialogue content, and combines the identified behavioral events and time information with the timestamp to construct a memory text.

[0078] A timestamp is a sequence of characters used to identify the time when a specific event occurred. In the intelligent interaction system 10, a timestamp can record the exact date and time of the behavioral event, such as "October 23, 2024, 6 PM," or it can be a relative time description, such as "two days ago," "last month," or "last Wednesday." The data cleaning module 12 uses date and time parsing technology in natural language processing to convert the identified time expression into a standardized time format. For example, it converts "this afternoon" into a specific date and time (if the conversation occurred on October 23, 2024, it might be "2024-10-23 15:00").
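The conversion of a relative expression into a standardized time can be sketched as follows. This is a minimal illustration under stated assumptions: the lookup table TIME_PHRASES, its representative hours, and the function normalize_time are hypothetical; a real system would rely on a full date and time parser rather than a fixed table.

```python
from datetime import datetime, timedelta

# Hypothetical lookup table: phrase -> (day offset, representative hour).
# The 15:00 value for "this afternoon" follows the example in the text;
# the other entries are assumptions for illustration.
TIME_PHRASES = {
    "this morning": (0, 9),
    "this afternoon": (0, 15),
    "this evening": (0, 19),
    "yesterday": (-1, 12),
}


def normalize_time(phrase, dialogue_time):
    """Resolve a time phrase against the time the dialogue took place.

    Returns a string in "YYYY-MM-DD HH:MM" format, or None if the phrase
    is not covered by the table.
    """
    key = phrase.strip().lower()
    if key not in TIME_PHRASES:
        return None
    day_offset, hour = TIME_PHRASES[key]
    day = dialogue_time + timedelta(days=day_offset)
    resolved = day.replace(hour=hour, minute=0, second=0, microsecond=0)
    return resolved.strftime("%Y-%m-%d %H:%M")


# Dialogue held at 6 PM on October 23, 2024, as in the example above.
print(normalize_time("this afternoon", datetime(2024, 10, 23, 18, 0)))
```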

[0079] Specifically, when a conversation occurs, the system automatically records the exact time of the conversation and attaches it as a timestamp to the behavioral event; alternatively, the system needs a time reference point (usually the current time) and then calculates a relative time description based on the absolute timestamp of the behavioral event. For example, if today is November 29, 2024, and the event occurred on November 27, 2024, the system will describe it as "two days ago".
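The calculation of a relative time description from an absolute timestamp and a reference point can be sketched as below. The function name describe_relative and the week and month thresholds are assumptions chosen for illustration; the application does not prescribe specific wording rules.

```python
from datetime import date

# Number words for the day range handled explicitly (2 to 6 days).
_DAY_WORDS = {2: "two", 3: "three", 4: "four", 5: "five", 6: "six"}


def describe_relative(event_day, reference_day):
    """Describe event_day relative to reference_day in plain English."""
    delta = (reference_day - event_day).days
    if delta < 0:
        raise ValueError("event_day must not be after reference_day")
    if delta == 0:
        return "today"
    if delta == 1:
        return "yesterday"
    if delta < 7:
        return f"{_DAY_WORDS[delta]} days ago"
    if delta < 30:
        weeks = delta // 7
        return "1 week ago" if weeks == 1 else f"{weeks} weeks ago"
    months = delta // 30  # coarse approximation: 30-day months
    return "1 month ago" if months == 1 else f"{months} months ago"


# The example from the text: event on November 27, reference November 29.
print(describe_relative(date(2024, 11, 27), date(2024, 11, 29)))
```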

[0080] The data cleaning module 12 organizes the identified behavioral events and time information into natural language to form a memory text. This memory text not only includes a description of the event but also the time when the event occurred, making the memory text richer and more specific. For example, combining "I went for a walk in the park" and "2024-10-23 15:00" forms "On the afternoon of October 23, 2024, I went for a walk in the park"; or combining "wanted to try a new Italian restaurant" and "2024-10-21" forms "Two days ago, you mentioned wanting to try a new Italian restaurant."
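One possible way to combine a behavioral event with its timestamp is a simple template, sketched below. The helper names part_of_day and build_memory_text and the hour boundaries for each part of the day are hypothetical; the application leaves the exact phrasing open.

```python
from datetime import datetime

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def part_of_day(hour):
    """Map an hour (0-23) to a coarse part of the day (assumed boundaries)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 22:
        return "evening"
    return "night"


def build_memory_text(event, event_time):
    """Combine a behavioral event and its timestamp into one memory text."""
    month = MONTHS[event_time.month - 1]
    when = (
        f"On the {part_of_day(event_time.hour)} of "
        f"{month} {event_time.day}, {event_time.year}"
    )
    return f"{when}, {event}."


print(build_memory_text("I went for a walk in the park",
                        datetime(2024, 10, 23, 15, 0)))
```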

[0081] Timestamps allow systems to reference past user actions in subsequent interactions, especially when it's necessary to evoke user memories. By using relative time descriptions, systems can engage in more natural conversations with users, such as, "You mentioned a restaurant you wanted to go to two days ago, when are you planning to go?" Furthermore, timestamps help systems understand the sequence of user behavior, thus providing a more personalized interactive experience. For example, systems can recommend relevant content or services based on a user's past activities.

[0082] The semantic feature extraction module 13 is used to extract semantic features from the dialogue record. These semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

[0083] Word vectors: Using word embedding models such as Word2Vec and GloVe, semantic relationships between words are captured, converting the vocabulary in the dialogue record into numerical vectors for computer processing and analysis, providing a foundation for subsequent semantic matching and analysis. For example, Word2Vec learns word vector representations by predicting a word from its context or the context from a given word; GloVe (Global Vectors for Word Representation) trains word vectors from statistics on the co-occurrence frequencies of words in the corpus.

[0084] Topic modeling: Extracts semantic features of text through methods such as clustering and topic modeling. For example, Latent Dirichlet Allocation (LDA) is used to obtain the topic distribution of each document and the vocabulary distribution of each topic by sampling through Dirichlet distribution, thereby discovering the topic structure in the text data and extracting semantic information from different dimensions, enhancing the depth and breadth of semantic understanding.

[0085] Sentiment analysis: Understanding users' emotional inclinations provides more reference information for generating responses. Using deep learning models such as BERT, text is classified according to sentiment, capturing deep semantic information. This is suitable for sentiment analysis tasks. The use of sentiment analysis allows the system not only to understand the content of the text but also to perceive the user's emotional state, further enhancing the naturalness and personalization of the interaction.

[0086] The semantic feature extraction module 13 can more accurately match and analyze users' intentions and needs by extracting semantic features such as word vectors, topic models, and sentiment analysis. Semantic feature extraction enables the system to better understand users' intentions, thereby providing more personalized responses. By deeply understanding users' semantic needs, the system can generate a more natural and considerate interactive experience.

[0087] Furthermore, the semantic feature extraction module 13 is used to extract semantic features from the dialogue records, wherein the dialogue records may include: text information directly input by the user, which may be questions, statements, commands, etc.; reply text generated by the system based on the user's input; historical records of previous interactions between the user and the system, which may contain information such as the user's behavioral experiences, preferences, and emotional expressions; and may also include text, images or videos in documents uploaded by the user, posts on social media, etc., all of which can serve as text data sources for the interaction system to analyze and understand the user's intent.

[0088] The semantic feature matching module 14 is used to perform semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library, so as to determine the memory text with the highest similarity to the current user input text. The memory texts include the user's historical behaviors and interaction experiences.

[0089] The semantic feature matching module 14 identifies the memory text of past interactions most relevant to the current user input by matching semantic features; using the matched memory text, it provides more information and background knowledge for generating a response, thereby improving the personalization and relevance of the response.

[0090] Specifically, the semantic feature matching module 14 receives the user's current input text and converts it into a semantic feature vector; it performs the same semantic feature extraction on each segment of memory text in the memory text library to form a semantic feature vector; it calculates the similarity between the semantic feature vector of the current input text and the memory text vector in the memory text library, and selects the memory text with the highest similarity as a reference for generating a response.

[0091] In this embodiment, the semantic feature matching module 14 converts the user's current input text and each segment of the memory text in the memory text library into semantic feature vectors using word vector models (such as Word2Vec, GloVe) or context embedding (such as BERT). The semantic feature matching module 14 uses algorithms such as cosine similarity and Euclidean distance to calculate the similarity between the semantic feature vector of the current input text and the semantic feature vector of each memory text in the memory text library, compares all the calculated similarity values, and finds the memory text with the highest similarity to the current input text.
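The matching step can be sketched with cosine similarity as follows. To stay self-contained, the sketch replaces the Word2Vec, GloVe, or BERT vectors named above with a toy bag-of-words vector; the functions embed, cosine_similarity, and most_similar_memory are hypothetical names, and only the similarity comparison itself reflects the described procedure.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic feature vector: lowercase word counts.

    A real system would use an embedding model here instead.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as Counters."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_memory(current_input, memory_texts):
    """Return (similarity, memory_text) for the best match, or (0.0, None)."""
    if not memory_texts:
        return (0.0, None)
    query = embed(current_input)
    scored = [(cosine_similarity(query, embed(m)), m) for m in memory_texts]
    return max(scored, key=lambda pair: pair[0])


memories = [
    "I went for a walk in the park",
    "I wanted to try a new Italian restaurant",
]
score, best = most_similar_memory("today I went to the park again", memories)
print(best)
```

Euclidean distance could be substituted by ranking on the smallest distance instead of the largest similarity; cosine similarity is used here because it ignores differences in text length.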

[0092] The prompt word construction module 15 is used to combine the memory text with the highest similarity to the current user input text determined by the semantic feature matching module 14 with the current user input text to construct prompt words for triggering user recall.

[0093] Specifically, the prompt word construction module 15 receives the relevant memory text retrieved by the semantic feature matching module 14 and the user's current input; it analyzes the user's current input to understand the user's needs, intentions, and emotional state. In this embodiment, the prompt word construction module 15 uses natural language processing technology to identify keywords, phrases, and emotional tendencies in the user's input. Further, the prompt word construction module 15 extracts key information from the memory text; wherein, the key information is related to the user's historical interactions. The memory text contains the user's past interaction records; extracting key information from it can help construct prompt words related to the user's history, increasing the personalization and intimacy of the response. The prompt word construction module 15 combines the analysis results of the user's current input and the key information from the memory text to design prompt words. The design of prompt words needs to integrate the user's current needs and past interactions to create prompt words that are both relevant and evoke the user's memory. The prompt word construction module 15 uses natural language generation technology to construct natural and fluent prompt words and outputs the constructed prompt words to the large language model invocation module 16. Natural language generation techniques, such as those based on the Transformer architecture (for example, GPT-2), can help the prompt word construction module 15 generate prompt words that sound natural and fluent, which can guide the large language model to generate more natural responses.

[0094] For example, a user tells the system, "Today I went to..." (the user describes their activities today). The semantic feature matching module 14 retrieves relevant memory text based on the user's current input, such as, "On xx year xx month xx day, the user went to the same place." The prompt word construction module 15 receives the user input and the relevant memory text, and begins constructing prompt words. Using natural language generation technology, it generates prompt words based on the following information:

[0095] The user's current input is: "Today I went to..."

[0096] Memory text: "On xx year xx month xx day, the user went to the same place to play."

[0097] Example of generated prompts: “It sounds like you had a lot of fun today… Do you remember you went to this place last month too…” (Please refer to the memory text to respond to user input and evoke the user’s memory).

[0098] By explicitly specifying the reference text in the prompts and designing dialogue to guide users to recall past interactions, users can experience the memory function of the interactive system from the responses, thereby improving the personalization and intelligence of the interaction.

[0099] The prompt word construction module 15 designs prompt words by combining the analysis results of the user's current input and key information from the memory text. Specifically, the prompt word first directly responds to the user's current input, i.e., "Today I went to...", by saying "It sounds like you had a lot of fun today...", demonstrating the system's understanding and concern for the user's current state. Next, the prompt word references information from the memory text, i.e., the user's previous activities at the same location, by asking, "Do you remember you went to this place last month too...?", demonstrating not only its memory function but also guiding the user to recall past interactions. Through this method of referencing past interactions, the system can evoke user memories, increasing the personalization and intimacy of the interaction, and may also guide users to share more details about their past experiences. By explicitly specifying the reference memory text in the prompt word, the system allows users to feel that it is not just a simple reply machine, but an intelligent system that can remember and reference the user's past interactions.

[0100] The prompt word construction module 15 embeds key information from the memory text into the prompt words, enabling the large language model to generate more personalized responses by referencing the user's historical interaction records. To enhance the personalization and intimacy of the interaction, the prompt words are specifically designed to trigger users' memories of past experiences, thereby improving user satisfaction and loyalty. Natural language generation technology helps keep the prompt words natural, fluent, and effective, allowing users to experience the system's memory function and making the entire interaction process more natural and human-like. Furthermore, these carefully constructed prompt words keep the large language model's responses closely related to the user's current input and historical interactions, thus improving the accuracy and relevance of the responses. By evoking user recollections in this way, the system can encourage deeper user participation in the dialogue, increasing the depth and richness of the interaction and providing users with a more engaging communication experience.
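As an illustration only, the combination of memory text and current input described above can be sketched as a fixed template. The function name `build_prompt` and the three-line layout are assumptions made for this sketch; the specification does not prescribe a template, and the embodiment may instead generate the prompt words with natural language generation technology. The elided strings are copied as they appear in the example above.

```python
def build_prompt(user_input: str, memory_text: str) -> str:
    """Combine the retrieved memory text and the current input into one prompt.

    Hypothetical layout: memory text, then user input, then an explicit
    instruction to reference the memory text.
    """
    return (
        f"Memory text: {memory_text.strip()}\n"
        f"User input: {user_input.strip()}\n"
        "Instruction: Please refer to the memory text to respond to the user "
        "input and evoke the user's memory."
    )


prompt = build_prompt(
    "Today I went to...",
    "On xx year xx month xx day, the user went to the same place to play.",
)
print(prompt)
```

Placing the memory text first and the instruction last is one possible ordering; what matters for the described effect is that the reference memory text is named explicitly in the prompt.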

[0101] The large language model invocation module 16 is used to invoke the large language model based on the prompt words constructed by the prompt word construction module 15, and generate a response to the user's current input. The response can be in text or voice format, selected and converted according to the user's needs.

[0102] Specifically, the large language model invocation module 16, through an API interface or by directly calling the code library of the large language model, passes the prompt words generated by the prompt word construction module 15 as input to the large language model; it then uses the deep learning neural network and language generation capabilities of the large language model (such as GPT, BERT, etc.) combined with the reference memory text in the prompt words to generate a response; and finally, it outputs the response generated by the large language model to the user in the form of text or speech. The response generated by the large language model based on the prompt words takes into account both the user's current input and the user's historical interactions.

[0103] The large language model invocation module 16 uses constructed prompts to invoke powerful large language models, such as GPT or BERT, to generate text or voice responses for the user. These prompts include reference memory text, enabling the large language model invocation module 16 to generate highly personalized responses that precisely meet the user's needs. By leveraging the language generation capabilities of the large language model, the large language model invocation module 16 provides users with high-quality text or voice responses, significantly enhancing the user experience and making the dialogue more natural and personalized. Furthermore, the combination of responses with memory text not only improves the system's intelligence, demonstrating its ability to remember and reference the user's historical interactions, but also provides flexible interaction methods, supporting both text and voice responses to meet the specific needs of different users. This comprehensive approach, utilizing both memory text and the large language model, not only enhances the personalization of the interaction but also makes the intelligent interaction system more flexible and user-friendly when providing services.
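A minimal sketch of the invocation step follows, assuming the model is reachable through some function that maps a prompt string to reply text. `ModelBackend`, `invoke_model`, and `echo_backend` are hypothetical names introduced here; no real model API is called, and the voice branch only marks the reply for speech synthesis rather than producing audio.

```python
from typing import Callable, Literal

# Hypothetical backend type: any function mapping a prompt string to reply text.
# In practice this could wrap an HTTP API call or a locally loaded model.
ModelBackend = Callable[[str], str]


def invoke_model(
    prompt: str,
    backend: ModelBackend,
    output_format: Literal["text", "voice"] = "text",
) -> dict:
    """Pass the prompt to the backend and wrap the reply in the chosen format."""
    reply = backend(prompt)
    if output_format == "voice":
        # A real system would hand the text to a text-to-speech engine here;
        # this sketch only marks the reply as needing synthesis.
        return {"format": "voice", "text_to_synthesize": reply}
    return {"format": "text", "text": reply}


def echo_backend(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned reply."""
    return f"(model reply based on {len(prompt)} prompt characters)"


result = invoke_model("Memory text: ...\nUser input: ...", echo_backend)
print(result)
```

Keeping the backend behind a plain callable means the same invocation code works whether the model is called through an API or through its code library, which are the two options the specification mentions.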

[0104] The following is a detailed application example that explains the working principle of the intelligent interaction system.

[0105] Suppose user Alice shared her trip to Thailand last summer with the intelligent interaction system, including the places she visited, the food she tried, and photos from the trip. The system recorded this conversation as a memory text and stored it in the memory text library, marked with the timestamp "July 15, 2023". Several months later, Alice spoke with the system again, mentioning that she was planning her next vacation.

[0106] Alice: "I'm thinking about where to go on vacation this summer."

[0107] The semantic feature matching module 14 extracts keywords such as "summer" and "vacation" from Alice's current input; it matches these keywords with the content in the memory text library and finds that Alice's memory text of traveling to Thailand last summer is related to the current topic.

[0108] The prompt word construction module 15 receives the memory text related to Alice's trip to Thailand last year and constructs prompt words to guide the large language model to generate a response that both addresses Alice's current plans and evokes her memories of last year's trip.

[0109] Example of generated prompt words:

[0110] Memory text: On July 15, 2023, Alice shared her travel experience in Thailand, including the places she visited, the food she tried, and photos from the trip.

[0111] User input: I'm thinking about where to go on vacation this summer.

[0112] The system replied: "It sounds like you've already started planning your summer vacation, that's exciting! Remember your trip to Thailand last summer? You shared those wonderful experiences, including delicious Thai food and beautiful beaches. Maybe you could consider going again, or exploring a new destination to experience different cultures and cuisines."

[0113] This response not only directly addressed Alice's current input about planning her vacation, but also evoked memories of her past experiences by referencing her text about a trip to Thailand last summer. This personalized response enhanced the coherence and intimacy of the conversation, making the interactive experience richer and more human.

[0114] As described above, this application provides an intelligent interactive system with memory function, capable of remembering a user's interaction history and behavioral experiences, providing users with more personalized and intelligent interactive services. Through semantic feature extraction and matching, it can accurately find the memory text related to the user's current input, providing more information and background knowledge for generating responses. The prompt word construction module can explicitly specify the reference memory text, allowing users to experience the memory function of the interactive system from the responses, improving the personalization and intelligence of the interaction. Utilizing a large language model to generate responses can provide high-quality text and voice responses, meeting the diverse needs of users.

[0115] Please see Figure 2, which is a flowchart illustrating the first embodiment of an intelligent interaction method with memory function provided by the present invention. The method includes the following steps:

[0116] Step S20: Collect the dialogue records between the user and the intelligent interaction system 10.

[0117] The dialogue records may include various forms of data such as text and voice to meet the needs of different user interactions. The collected dialogue records are the basis for subsequent processing and analysis.

[0118] Specifically, the system collects dialogue records in real time by listening to the interaction interface between the user and the intelligent interaction system 10; and stores the collected dialogue records in a local database or cloud server. This storage method ensures the persistence and accessibility of the data so that subsequent modules can access and process it.

[0119] Step S21: Extract the user's behavioral experience from the dialogue records and mark the identified behavioral experience with timestamps to form memory text.

[0120] The behavioral experience includes behavioral events and time information.

[0121] Please also refer to Figure 3. Step S21, namely, extracting the user's behavioral experience from the dialogue records and marking the identified behavioral experience with timestamps to form memory text, specifically includes:

[0122] Step S210: Organize the user's behavioral experience from the dialogue record, including identifying the activities mentioned by the user in the dialogue (behavioral events) and the time when these activities occurred (time information).

[0123] In this embodiment, Natural Language Processing (NLP) technology is used to analyze and extract dialogue records to identify user behavioral events, time information, etc. Behavioral events refer to the user's activities, such as where they went, where they ate, what they did, and what they ate. Time information is derived from the time and content of the dialogue to determine the time of the user's activity. For example, if the dialogue occurred at 6 PM on October 23, 2024, and the content included "This afternoon I…", then the activity occurred on "the afternoon of October 23, 2024".
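The derivation of time information from the dialogue time and content can be sketched as follows. This is a simplified stand-in, assuming a small hand-written phrase table (`RELATIVE_PHRASES`, hypothetical) in place of the NLP analysis the embodiment refers to; it reproduces the "This afternoon I…" example above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical phrase table: (phrase, day offset, part of day).
RELATIVE_PHRASES = [
    ("this morning", 0, "morning"),
    ("this afternoon", 0, "afternoon"),
    ("yesterday", -1, ""),
    ("today", 0, ""),
]


def resolve_activity_time(utterance: str, dialogue_time: datetime) -> Optional[str]:
    """Map a relative time phrase in the utterance to a calendar date."""
    lowered = utterance.lower()
    for phrase, day_offset, part_of_day in RELATIVE_PHRASES:
        if phrase in lowered:
            day = (dialogue_time + timedelta(days=day_offset)).date()
            date_text = f"{day:%B} {day.day}, {day.year}"
            if part_of_day:
                return f"the {part_of_day} of {date_text}"
            return date_text
    # No known phrase found; the caller decides how to handle this.
    return None


spoken_at = datetime(2024, 10, 23, 18, 0)  # 6 PM on October 23, 2024
print(resolve_activity_time("This afternoon I...", spoken_at))
# prints: the afternoon of October 23, 2024
```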

[0124] Step S211: Generate a corresponding timestamp based on the time information in the user's behavioral experience, and combine the user's behavioral experience with the timestamp to construct a memory text.

[0125] The timestamp can record the exact date and time of the event, such as "October 23, 2024, 6 PM", or it can be a relative time description, such as "two days ago", "last month", "last Wednesday", etc.
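One way to picture Step S211 is a small record type whose fields are joined into a single memory text string. The class `BehavioralExperience` and the bracketed output layout are assumptions made for this sketch; the bracket form is chosen only because it reads acceptably with both exact and relative timestamps.

```python
from dataclasses import dataclass


@dataclass
class BehavioralExperience:
    subject: str     # who performed the activity, e.g. "the user"
    event: str       # behavioral event: what the user did
    timestamp: str   # exact time or relative time description


def to_memory_text(exp: BehavioralExperience) -> str:
    """Join the timestamp and the behavioral event into one memory text."""
    return f"[{exp.timestamp}] {exp.subject} {exp.event}."


exact = BehavioralExperience(
    "the user", "went to the same place to play", "October 23, 2024, 6 PM"
)
relative = BehavioralExperience(
    "the user", "went to the same place to play", "last Wednesday"
)
print(to_memory_text(exact))
print(to_memory_text(relative))
```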

[0126] Furthermore, the memory text is stored in a memory text library. This memory text library can be stored in the form of a database, file system, or similar formats to facilitate quick access and querying by subsequent modules.

[0127] Step S212: Store the memory text in the memory text library.

[0128] The memory text library can be stored in the form of a database, file system, or other similar formats to facilitate quick access and querying by subsequent modules.
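As a sketch of the database option, the memory text library could be a single SQLite table. The table name, column names, and helper functions below are hypothetical; the specification only states that a database, file system, or similar format may be used.

```python
import sqlite3


def open_library(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a memory text library backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memory_text ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, "
        "timestamp TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def save_memory(conn: sqlite3.Connection, user_id: str,
                timestamp: str, content: str) -> int:
    """Insert one memory text and return its row id."""
    cur = conn.execute(
        "INSERT INTO memory_text (user_id, timestamp, content) VALUES (?, ?, ?)",
        (user_id, timestamp, content),
    )
    conn.commit()
    return cur.lastrowid


def load_memories(conn: sqlite3.Connection, user_id: str) -> list:
    """Return (timestamp, content) pairs for one user in insertion order."""
    rows = conn.execute(
        "SELECT timestamp, content FROM memory_text WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return rows.fetchall()


conn = open_library()
save_memory(conn, "alice", "July 15, 2023",
            "Alice shared her travel experience in Thailand.")
print(load_memories(conn, "alice"))
```

The default `":memory:"` path keeps the example self-contained; passing a file path instead would give the persistence described for the local database case.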

[0129] Step S22: Extract semantic features from the dialogue record.

[0130] The semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

[0131] Word vectors: Word embedding models such as Word2Vec and GloVe capture semantic relationships between words and convert the vocabulary in the dialogue record into numerical vectors for computer processing and analysis, providing a foundation for subsequent semantic matching and analysis. For example, Word2Vec learns word vector representations by predicting a word from its context or the context from a given word; GloVe (Global Vectors for Word Representation) trains word vectors by statistically analyzing the co-occurrence frequencies of words in the corpus.
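To make the notion of co-occurrence frequency concrete, the following sketch counts how often word pairs appear near each other in a token sequence. It shows only the counting stage that co-occurrence-based methods start from, not GloVe's training procedure; the function name, the window size, and the sample sentence are made up for illustration.

```python
from collections import Counter


def cooccurrence_counts(tokens: list, window: int = 2) -> Counter:
    """Count how often each ordered word pair appears within `window` tokens."""
    counts = Counter()
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                counts[(word, tokens[j])] += 1
    return counts


# Made-up sample sentence, split on whitespace.
tokens = "the user went to the beach and the user ate thai food".split()
counts = cooccurrence_counts(tokens)
print(counts[("the", "user")], counts[("thai", "food")])
```

Words that share many neighbors end up with similar count profiles, which is the intuition behind converting vocabulary into vectors that can be compared numerically.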

[0132] Topic modeling: Extracts semantic features of text through methods such as clustering and topic modeling. For example, Latent Dirichlet Allocation (LDA) is used to obtain the topic distribution of each document and the vocabulary distribution of each topic by sampling through Dirichlet distribution, thereby discovering the topic structure in the text data and extracting semantic information from different dimensions, enhancing the depth and breadth of semantic understanding.

[0133] Sentiment analysis: Understanding users' emotional inclinations provides more reference information for generating responses. Using deep learning models such as BERT, text is classified according to sentiment, capturing deep semantic information. This is suitable for sentiment analysis tasks. The use of sentiment analysis allows the system not only to understand the content of the text but also to perceive the user's emotional state, further enhancing the naturalness and personalization of the interaction.

[0134] Furthermore, the dialogue records may include: text information directly entered by the user, which may be questions, statements, commands, etc.; response text generated by the system based on the user's input; historical records of previous interactions between the user and the system, which may contain information such as the user's behavioral experiences, preferences, and emotional expressions; and may also include text, images or videos from documents uploaded by the user, posts on social media, etc., all of which can serve as text data sources for the interactive system to analyze and understand the user's intent.

[0135] Step S23: Perform semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text.

[0136] The memory text contains the user's historical behavior and interaction experience.

[0137] Please also refer to Figure 4. Step S23, namely, performing semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text, specifically includes:

[0138] Step S230: Receive the user's current input text and convert it into a semantic feature vector;

[0139] In this embodiment, the user's current input text and each piece of text in the memory text library are converted into semantic feature vectors through word vector models (such as Word2Vec, GloVe) or context embedding (such as BERT).

[0140] Step S231: Extract semantic features from each segment of memory text in the memory text library in the same way to form a semantic feature vector;

[0141] Step S232: Calculate the similarity between the semantic feature vector of the current input text and the memory text vector in the memory text library, and select the memory text with the highest similarity as a reference for generating a response.

[0142] In this embodiment, algorithms such as cosine similarity and Euclidean distance are used to calculate the similarity between the semantic feature vector of the current input text and the semantic feature vector of each memory text in the memory text library. All calculated similarity values are compared to find the memory text with the highest similarity to the current input text.
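The similarity calculation and selection in Step S232 can be sketched with plain Python lists. The three-dimensional vectors and the memory identifiers below are made up for illustration; in the embodiment the vectors would come from a word vector or context embedding model. Note that for cosine similarity a larger value means more similar, whereas for Euclidean distance a smaller value means more similar.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: list, b: list) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query: list, memory_vectors: dict) -> tuple:
    """Return (memory_id, score) with the highest cosine similarity.

    Raises ValueError if memory_vectors is empty.
    """
    scored = ((mid, cosine_similarity(query, vec)) for mid, vec in memory_vectors.items())
    return max(scored, key=lambda pair: pair[1])


# Made-up feature vectors keyed by hypothetical memory identifiers.
memory_vectors = {
    "thailand_trip": [0.9, 0.1, 0.3],
    "dentist_visit": [0.1, 0.8, 0.2],
}
query = [0.8, 0.2, 0.4]
best_id, score = most_similar(query, memory_vectors)
print(best_id, round(score, 3))
```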

[0143] Step S24: Combine the memory text with the highest similarity to the current user input text with the current user input text to construct a prompt word to trigger the user's recall.

[0144] Please also refer to Figure 5. Step S24, namely, combining the memory text with the highest similarity to the current user input text with the current user input text to construct a prompt word to trigger the user's recall, specifically includes:

[0145] Step S240: Receive the retrieved relevant memory text and the user's current input;

[0146] Step S241: Analyze the user's current input to understand the user's needs, intentions, and emotional state;

[0147] In this embodiment, natural language processing technology is used to identify keywords, phrases, and sentiment in user input.

[0148] Step S242: Extract key information from the memory text;

[0149] The key information mentioned here is related to the user's historical interactions. The memory text contains the user's past interaction records, and extracting key information from it can help construct prompts related to the user's history, increasing the personalization and friendliness of the response.

[0150] Step S243: Design prompt words by combining the analysis results of the user's current input with the key information of the memory text.

[0151] The design of prompts needs to integrate the user's current needs with past interactions to create prompts that are both relevant and memorable. In this embodiment, natural language generation techniques are used to construct natural and fluent prompts. Natural language generation techniques, such as Transformer or GPT-2, can help generate prompts that sound natural and fluent, which can guide large language models to generate more natural responses.

[0152] Step S25: Based on the constructed prompt words, call the large language model to generate a response to the user's current input.

[0153] The response can be in text or voice format, depending on the user's needs.

[0154] Specifically, the generated prompt words are passed as input to the large language model via an API interface or by directly calling the model's code library. The large language model (such as GPT or BERT) utilizes its deep learning neural network and language generation capabilities, combined with the reference memory text in the prompt words, to generate a response. The response generated by the large language model is then output to the user in text or speech format. The response generated by the large language model based on the prompt words takes into account both the user's current input and their historical interactions.
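Steps S23 to S25 can be tied together in a compact sketch. Everything here is a stand-in chosen to keep the example self-contained: `embed` uses bag-of-words counts instead of a learned embedding, the two library entries are made-up memory texts, and the model is any callable that maps a prompt to a reply.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Stand-in for S230/S231: bag-of-words counts, not a learned embedding."""
    return Counter(word.strip(".,!?\"'").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def respond(user_input: str, library: list, model: Callable[[str], str]) -> str:
    """Run S23 to S25 once; `library` must contain at least one memory text."""
    # S23: pick the memory text most similar to the current input.
    query = embed(user_input)
    best = max(library, key=lambda text: cosine(query, embed(text)))
    # S24: combine the memory text and the current input into a prompt.
    prompt = (
        f"Memory text: {best}\n"
        f"User input: {user_input}\n"
        "Please refer to the memory text to respond to the user input."
    )
    # S25: call the model with the constructed prompt.
    return model(prompt)


# Made-up memory texts for illustration.
library = [
    "Last summer the user went on vacation to Thailand.",
    "Last Wednesday the user ate at a noodle shop.",
]
# The identity function stands in for a model so the prompt itself is printed.
print(respond("I'm thinking about where to go on vacation this summer.",
              library, lambda prompt: prompt))
```

With word-count vectors, matching depends on literal word overlap; a learned embedding as described in the embodiment would also match memory texts that are related in meaning but share few words.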

[0155] In this embodiment, the advantages and beneficial effects of the intelligent interaction method with memory function have been described above and will not be repeated here. Furthermore, since the intelligent interaction method with memory function is applied to the intelligent interaction system with memory function, the intelligent interaction method with memory function also has the same advantages and beneficial effects.

[0156] One embodiment of the present invention also provides a readable storage medium storing a computer program, the computer program including program instructions, which, when executed by a processor of an electronic device, cause the processor to perform the steps of any of the above-described intelligent interactive methods with memory function.

[0157] One embodiment of the present invention also provides an electronic device, including: a processor and a memory, the memory being used to store computer program code, the computer program code including computer instructions, and when the processor executes the computer instructions, the electronic device performs the steps of any of the above-described intelligent interactive methods with memory function.

[0158] Please see Figure 6, which is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present invention.

[0159] The electronic device 3 includes a processor 31, a memory 32, an input device 33, and an output device 34. The processor 31, memory 32, input device 33, and output device 34 are coupled together via connectors, which may include various interfaces, transmission lines, or buses, etc., and are not limited in this respect in the embodiments of the present invention. It should be understood that in the various embodiments of the present invention, coupling refers to mutual connection through a specific method, including direct connection or indirect connection through other devices, such as through various interfaces, transmission lines, buses, etc.

[0160] The processor 31 can be one or more graphics processing units (GPUs). If the processor 31 is a GPU, the GPU can be a single-core GPU or a multi-core GPU. Optionally, the processor 31 can be a processor group composed of multiple GPUs, with the multiple processors coupled to each other via one or more buses. Optionally, the processor can also be other types of processors, etc., and the embodiments of the present invention are not limited thereto.

[0161] The memory 32 can be used to store computer program 35, as well as various types of computer program code, including program code for executing the present invention. Optionally, the memory 32 may include at least one of random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used to store related instructions and data.

[0162] Input device 33 is used to input data and / or signals, and output device 34 is used to output data and / or signals. Input device 33 and output device 34 can be independent devices or an integrated device.

[0163] It is understood that in this embodiment of the invention, the memory 32 can be used to store not only related instructions but also related data, and the specific data stored in the memory is not limited.

[0164] It is understood that Figure 6 shows merely a simplified design of an electronic device. In practical applications, the electronic device may also include other necessary components, including, but not limited to, any number of input / output devices, processors, memories, etc., and all electronic devices that can implement embodiments of the present invention are within the scope of protection of the present invention.

[0165] The above description is only a partial embodiment of this application and does not limit the scope of protection of this application. Any equivalent device or equivalent process transformation made based on the content of this application specification and drawings, or directly or indirectly applied to other related technical fields, are similarly included within the scope of patent protection of this application.

Claims

1. An intelligent interactive system, characterized in that it includes: a data collection module, used to collect dialogue records between a user and the intelligent interaction system; a data cleaning module, used to extract the user's behavioral experience from the dialogue records, mark the identified behavioral experience with timestamps to form memory text, and save the memory text to a memory text library, wherein the behavioral experience includes behavioral events and time information; a semantic feature extraction module, used to extract semantic features from the dialogue records; a semantic feature matching module, used to perform semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text, wherein the memory texts contain the user's historical behavior and interaction experience; a prompt word construction module, used to combine the memory text with the highest similarity to the current user input text, as determined by the semantic feature matching module, with the user's current input text to construct prompt words for triggering user recall; and a large language model invocation module, used to invoke the large language model based on the prompt words constructed by the prompt word construction module to generate a response to the user's current input.

2. The intelligent interactive system according to claim 1, characterized in that the prompt word construction module is used to: receive the memory text with the highest similarity to the current user input text, as determined by the semantic feature matching module, and the user's current input; analyze the user's current input to understand the user's needs, intentions, and emotional state; extract key information from the most similar memory text, wherein the key information is related to the user's historical interactions; and design prompt words by combining the analysis results of the user's current input with the key information from the memory text.

3. The intelligent interactive system according to claim 2, characterized in that the data cleaning module is used to: extract the user's behavioral experience from the dialogue records; generate a corresponding timestamp based on the time information in the user's behavioral experience, and combine the user's behavioral experience with the timestamp to construct a memory text; and store the memory text in the memory text library.

4. The intelligent interactive system according to claim 3, characterized in that, The timestamp records the exact time information of the behavioral event and / or a relative time description.

5. The intelligent interactive system according to claim 1, characterized in that the semantic feature matching module is used to: receive the user's current input text and convert it into a semantic feature vector; perform the same semantic feature extraction on each segment of memory text in the memory text library to form a semantic feature vector; and calculate the similarity between the semantic feature vector of the current input text and the memory text vectors in the memory text library, and select the memory text with the highest similarity as the reference for generating the response.

6. The intelligent interactive system according to claim 5, characterized in that, The semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

7. The intelligent interactive system according to claim 5, characterized in that, The semantic feature extraction module extracts semantic features from the dialogue records; wherein, the dialogue records include, but are not limited to, text input by the user, reply text generated by the system based on the user input, historical records of previous interactions between the user and the system, text, images or videos in documents uploaded by the user, and posts on social media.

8. An intelligent interaction method, characterized in that the method includes: collecting dialogue records between a user and an intelligent interaction system; extracting the user's behavioral experience from the dialogue records, marking the identified behavioral experience with timestamps to form memory text, and saving the memory text to a memory text library, wherein the behavioral experience includes behavioral events and time information; extracting semantic features from the dialogue records; performing semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text, wherein the memory texts contain the user's historical behavior and interaction experience; combining the determined memory text with the highest similarity to the current user input text with the current user input text to construct prompt words to trigger the user's recall; and calling the large language model based on the constructed prompt words to generate a response to the user's current input.

9. The intelligent interaction method according to claim 8, characterized in that combining the determined memory text with the highest similarity to the current user input text with the current user input text to construct prompt words to trigger the user's recall specifically includes: receiving the memory text with the highest similarity to the current user input text and the user's current input; analyzing the user's current input to understand the user's needs, intentions, and emotional state; extracting key information from the most similar memory text, wherein the key information is related to the user's historical interactions; and designing prompt words by combining the analysis results of the user's current input with the key information from the memory text.

10. The intelligent interaction method according to claim 9, characterized in that extracting the user's behavioral experience from the dialogue records, marking the identified behavioral experience with timestamps to form memory text, and saving the memory text to the memory text library specifically includes: extracting the user's behavioral experience from the dialogue records; generating a corresponding timestamp based on the time information in the user's behavioral experience, and combining the user's behavioral experience with the timestamp to construct a memory text; and storing the memory text in the memory text library.

11. The intelligent interaction method according to claim 10, characterized in that, The timestamp records the exact time information of the behavioral event and / or a relative time description.

12. The intelligent interaction method according to claim 8, characterized in that performing semantic feature similarity matching between the current user input text and the memory texts stored in the memory text library to determine the memory text with the highest similarity to the current user input text specifically includes: receiving the user's current input text and converting it into a semantic feature vector; performing the same semantic feature extraction on each segment of memory text in the memory text library to form a semantic feature vector; and calculating the similarity between the semantic feature vector of the current input text and the memory text vectors in the memory text library, and selecting the memory text with the highest similarity as the reference for generating the response.

13. The intelligent interaction method according to claim 12, characterized in that the semantic features include, but are not limited to, word vectors, topic models, and sentiment analysis.

14. The intelligent interaction method according to claim 12, characterized in that semantic features are extracted from the dialogue records, wherein the dialogue records include, but are not limited to, text entered by the user, reply text generated by the system based on the user's input, historical records of previous interactions between the user and the system, text, images or videos in documents uploaded by the user, and posts on social media.

15. A readable storage medium storing a computer program, characterized in that, The computer program includes program instructions that, when executed by the processor of the electronic device, cause the processor to perform the steps of the intelligent interaction method as described in any one of claims 8 to 14.

16. An electronic device, comprising a processor and a memory, characterized in that the memory is used to store computer program code, the computer program code including computer instructions, wherein when the processor executes the computer instructions, the electronic device performs the steps of the intelligent interaction method as described in any one of claims 8 to 14.